2025-09-07T07:44:40.0672345Z Current runner version: '2.328.0' 2025-09-07T07:44:40.0678829Z Runner name: 'i-03028b1668c838483-1003' 2025-09-07T07:44:40.0679677Z Runner group name: 'default' 2025-09-07T07:44:40.0680637Z Machine name: '15a98ee0aa9d' 2025-09-07T07:44:40.0683638Z ##[group]GITHUB_TOKEN Permissions 2025-09-07T07:44:40.0685994Z Contents: read 2025-09-07T07:44:40.0686574Z Metadata: read 2025-09-07T07:44:40.0687225Z ##[endgroup] 2025-09-07T07:44:40.0689346Z Secret source: Actions 2025-09-07T07:44:40.0690153Z Prepare workflow directory 2025-09-07T07:44:40.1206100Z Prepare all required actions 2025-09-07T07:44:40.1247813Z Getting action download info 2025-09-07T07:44:40.4476735Z Download action repository 'pytorch/test-infra@main' (SHA:548a4bc624d43a01cdf165a63b041f0ae014ddbd) 2025-09-07T07:44:42.9013671Z Download action repository 'pytorch/pytorch@main' (SHA:ada43ed39c80b746b4822c92640a1882619e2795) 2025-09-07T07:45:10.2769905Z Download action repository 'actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065' (SHA:a26af69be951a213d495a4c3e4e4022e16d87065) 2025-09-07T07:45:11.2572826Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722) 2025-09-07T07:45:12.1490597Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076) 2025-09-07T07:45:12.3063982Z Download action repository 'seemethere/upload-artifact-s3@baba72d0712b404f646cebe0730933554ebce96a' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-09-07T07:45:12.5470437Z Getting action download info 2025-09-07T07:45:12.6733231Z Download action repository 'actions/checkout@v4' (SHA:08eba0b27e820071cde6df949e0beb9ba4906955) 2025-09-07T07:45:13.1911874Z Getting action download info 2025-09-07T07:45:13.3076726Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e) 
2025-09-07T07:45:13.5268662Z Getting action download info 2025-09-07T07:45:13.6281110Z Download action repository 'nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482' (SHA:3e91a01664abd3c5cd539100d10d33b9c5b68482) 2025-09-07T07:45:13.7787391Z Getting action download info 2025-09-07T07:45:13.9238154Z Uses: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/heads/main (93fb23d6fae7c4e82c4239a1033e522088742634) 2025-09-07T07:45:13.9242155Z ##[group] Inputs 2025-09-07T07:45:13.9242596Z build-environment: linux-jammy-cuda12.8-py3.10-gcc9-sm80 2025-09-07T07:45:13.9248986Z test-matrix: {"include": [{"config": "inductor_huggingface_perf", "shard": 1, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 2, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 3, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 4, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 5, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 1, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 2, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 3, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 4, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 5, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 6, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 1, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 2, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 3, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 4, 
"num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 5, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 6, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "cachebench", "shard": 1, "num_shards": 2, "runner": "linux.aws.a100"}, {"config": "cachebench", "shard": 2, "num_shards": 2, "runner": "linux.aws.a100"}]} 2025-09-07T07:45:13.9255876Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:45:13.9256888Z sync-tag: 2025-09-07T07:45:13.9257838Z timeout-minutes: 1440 2025-09-07T07:45:13.9258116Z use-gha: 2025-09-07T07:45:13.9259361Z dashboard-tag: training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true 2025-09-07T07:45:13.9260694Z s3-bucket: gha-artifacts 2025-09-07T07:45:13.9260982Z aws-role-to-assume: 2025-09-07T07:45:13.9261609Z disable-monitor: false 2025-09-07T07:45:13.9261938Z monitor-log-interval: 15 2025-09-07T07:45:13.9262245Z monitor-data-collect-interval: 4 2025-09-07T07:45:13.9262616Z ##[endgroup] 2025-09-07T07:45:13.9263090Z Complete job name: cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T07:45:13.9842486Z ##[group]Run pytorch/test-infra/.github/actions/setup-ssh@main 2025-09-07T07:45:13.9843543Z with: 2025-09-07T07:45:13.9844104Z github-secret: *** 2025-09-07T07:45:13.9844850Z instructions: All testing is done inside the container, to start an interactive session run: docker exec -it $(docker container ps --format '{{.ID}}') bash 2025-09-07T07:45:13.9845643Z activate-with-label: false 2025-09-07T07:45:13.9845928Z label: with-ssh 2025-09-07T07:45:13.9846182Z remove-existing-keys: true 2025-09-07T07:45:13.9846470Z 
fail-silently: true 2025-09-07T07:45:13.9846932Z env: 2025-09-07T07:45:13.9847153Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:45:13.9847420Z ##[endgroup] 2025-09-07T07:45:14.1006406Z Please see https://github.com/pytorch/pytorch/wiki/Debugging-using-with-ssh-for-Github-Actions for more info. 2025-09-07T07:45:14.1008147Z Not on pull request and ciflow reference could not be extracted, skipping adding ssh keys 2025-09-07T07:45:14.1205119Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main 2025-09-07T07:45:14.1205587Z with: 2025-09-07T07:45:14.1205815Z no-sudo: true 2025-09-07T07:45:14.1206060Z submodules: recursive 2025-09-07T07:45:14.1206397Z fetch-depth: 0 2025-09-07T07:45:14.1206718Z env: 2025-09-07T07:45:14.1206941Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:45:14.1207214Z ##[endgroup] 2025-09-07T07:45:14.1290400Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-09-07T07:45:14.1291432Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-09-07T07:45:14.1309146Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:45:14.1309570Z env: 2025-09-07T07:45:14.1309797Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:45:14.1310073Z ##[endgroup] 2025-09-07T07:45:14.1479358Z ##[group]Run actions/checkout@v4 2025-09-07T07:45:14.1479675Z with: 2025-09-07T07:45:14.1479934Z ref: 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:45:14.1480279Z fetch-depth: 0 2025-09-07T07:45:14.1480530Z submodules: recursive 2025-09-07T07:45:14.1480787Z show-progress: false 2025-09-07T07:45:14.1481059Z repository: pytorch/pytorch 2025-09-07T07:45:14.1481477Z token: *** 2025-09-07T07:45:14.1481707Z ssh-strict: true 2025-09-07T07:45:14.1481951Z ssh-user: git 2025-09-07T07:45:14.1482188Z persist-credentials: true 2025-09-07T07:45:14.1482469Z clean: true 2025-09-07T07:45:14.1482720Z 
sparse-checkout-cone-mode: true 2025-09-07T07:45:14.1483265Z fetch-tags: false 2025-09-07T07:45:14.1483495Z lfs: false 2025-09-07T07:45:14.1483733Z set-safe-directory: true 2025-09-07T07:45:14.1484224Z env: 2025-09-07T07:45:14.1484444Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:45:14.1484696Z ##[endgroup] 2025-09-07T07:45:14.2613845Z Syncing repository: pytorch/pytorch 2025-09-07T07:45:14.2615227Z ##[group]Getting Git version info 2025-09-07T07:45:14.2615673Z Working directory is '/home/charlie/_work/pytorch/pytorch' 2025-09-07T07:45:14.2616290Z [command]/usr/bin/git version 2025-09-07T07:45:14.2619138Z git version 2.51.0 2025-09-07T07:45:14.2645168Z ##[endgroup] 2025-09-07T07:45:14.2657294Z Temporarily overriding HOME='/home/charlie/_work/_temp/37c18b03-972e-4c44-affb-1500bf1660ea' before making global git config changes 2025-09-07T07:45:14.2658365Z Adding repository directory to the temporary git global config as a safe directory 2025-09-07T07:45:14.2662734Z [command]/usr/bin/git config --global --add safe.directory /home/charlie/_work/pytorch/pytorch 2025-09-07T07:45:14.2695710Z Deleting the contents of '/home/charlie/_work/pytorch/pytorch' 2025-09-07T07:45:14.2698852Z ##[group]Initializing the repository 2025-09-07T07:45:14.2702215Z [command]/usr/bin/git init /home/charlie/_work/pytorch/pytorch 2025-09-07T07:45:14.2744935Z hint: Using 'master' as the name for the initial branch. This default branch name 2025-09-07T07:45:14.2745587Z hint: is subject to change. To configure the initial branch name to use in all 2025-09-07T07:45:14.2746184Z hint: of your new repositories, which will suppress this warning, call: 2025-09-07T07:45:14.2746602Z hint: 2025-09-07T07:45:14.2746924Z hint: git config --global init.defaultBranch 2025-09-07T07:45:14.2747287Z hint: 2025-09-07T07:45:14.2747635Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and 2025-09-07T07:45:14.2748253Z hint: 'development'. 
The just-created branch can be renamed via this command: 2025-09-07T07:45:14.2748701Z hint: 2025-09-07T07:45:14.2748929Z hint: git branch -m 2025-09-07T07:45:14.2749202Z hint: 2025-09-07T07:45:14.2749583Z hint: Disable this message with "git config set advice.defaultBranchName false" 2025-09-07T07:45:14.2750217Z Initialized empty Git repository in /home/charlie/_work/pytorch/pytorch/.git/ 2025-09-07T07:45:14.2755428Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch 2025-09-07T07:45:14.2790521Z ##[endgroup] 2025-09-07T07:45:14.2791024Z ##[group]Disabling automatic garbage collection 2025-09-07T07:45:14.2794045Z [command]/usr/bin/git config --local gc.auto 0 2025-09-07T07:45:14.2823305Z ##[endgroup] 2025-09-07T07:45:14.2823756Z ##[group]Setting up auth 2025-09-07T07:45:14.2832812Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-09-07T07:45:14.2860294Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-09-07T07:45:14.3083846Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-09-07T07:45:14.3109740Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-09-07T07:45:14.3310972Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-09-07T07:45:14.3348332Z ##[endgroup] 2025-09-07T07:45:14.3348790Z ##[group]Fetching the repository 2025-09-07T07:45:14.3355898Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-09-07T07:46:33.4890140Z From https://github.com/pytorch/pytorch 
2025-09-07T07:46:33.4890668Z * [new branch] 160583 -> origin/160583 2025-09-07T07:46:33.4891188Z * [new branch] 2.6.0.dev20241004+ -> origin/2.6.0.dev20241004+ 2025-09-07T07:46:33.4891878Z * [new branch] 5addvllmbuild -> origin/5addvllmbuild 2025-09-07T07:46:33.4894815Z * [new branch] AaronWang04_addmmfusion_perftest -> origin/AaronWang04_addmmfusion_perftest 2025-09-07T07:46:33.4895548Z * [new branch] HDCharles-2.6.0-release-notes -> origin/HDCharles-2.6.0-release-notes 2025-09-07T07:46:33.4896149Z * [new branch] ISSUE-154849 -> origin/ISSUE-154849 2025-09-07T07:46:33.4896837Z * [new branch] JackCaoG/dynamo_make_fx_non_core_aten_ops -> origin/JackCaoG/dynamo_make_fx_non_core_aten_ops 2025-09-07T07:46:33.4897672Z * [new branch] NicoshevSVE128 -> origin/NicoshevSVE128 2025-09-07T07:46:33.4898252Z * [new branch] PR-AOTInductorNoneBug -> origin/PR-AOTInductorNoneBug 2025-09-07T07:46:33.4898900Z * [new branch] PR-AOTInductorNoneBugFix -> origin/PR-AOTInductorNoneBugFix 2025-09-07T07:46:33.4899508Z * [new branch] PR-FixConfigsIssue -> origin/PR-FixConfigsIssue 2025-09-07T07:46:33.4900100Z * [new branch] PR-NoneBugFix-viable -> origin/PR-NoneBugFix-viable 2025-09-07T07:46:33.4900676Z * [new branch] PR-ResetToZero -> origin/PR-ResetToZero 2025-09-07T07:46:33.4901281Z * [new branch] Update-Flash-Packaging -> origin/Update-Flash-Packaging 2025-09-07T07:46:33.4901866Z * [new branch] VLA_exp -> origin/VLA_exp 2025-09-07T07:46:33.4902455Z * [new branch] actually-run-mps-aot-inductor -> origin/actually-run-mps-aot-inductor 2025-09-07T07:46:33.4903199Z * [new branch] add-missing-args-normalization -> origin/add-missing-args-normalization 2025-09-07T07:46:33.4903900Z * [new branch] add-user-guide-structure -> origin/add-user-guide-structure 2025-09-07T07:46:33.4904533Z * [new branch] add-vllm-nightly-build -> origin/add-vllm-nightly-build 2025-09-07T07:46:33.4905156Z * [new branch] add_compile_benchmarking -> origin/add_compile_benchmarking 2025-09-07T07:46:33.4905813Z * [new 
branch] addmm-heuristic -> origin/addmm-heuristic 2025-09-07T07:46:33.4906339Z * [new branch] addsimde -> origin/addsimde 2025-09-07T07:46:33.4907043Z * [new branch] addvllmtest -> origin/addvllmtest 2025-09-07T07:46:33.4907577Z * [new branch] adi/acl_upgrade -> origin/adi/acl_upgrade 2025-09-07T07:46:33.4908092Z * [new branch] adi/test -> origin/adi/test 2025-09-07T07:46:33.4908594Z * [new branch] adi/test_bgemm -> origin/adi/test_bgemm 2025-09-07T07:46:33.4909110Z * [new branch] adi/test_fusions -> origin/adi/test_fusions 2025-09-07T07:46:33.4909667Z * [new branch] adi/test_onednn_v3.9 -> origin/adi/test_onednn_v3.9 2025-09-07T07:46:33.4910358Z * [new branch] adi/test_presve_change -> origin/adi/test_presve_change 2025-09-07T07:46:33.4910956Z * [new branch] adi/test_timm -> origin/adi/test_timm 2025-09-07T07:46:33.4911514Z * [new branch] adi/testpresve_change -> origin/adi/testpresve_change 2025-09-07T07:46:33.4912104Z * [new branch] aditew01/test/vec_bf16 -> origin/aditew01/test/vec_bf16 2025-09-07T07:46:33.4912724Z * [new branch] ah-globalfeedback-hook -> origin/ah-globalfeedback-hook 2025-09-07T07:46:33.4913289Z * [new branch] alt-disable -> origin/alt-disable 2025-09-07T07:46:33.4913900Z * [new branch] angelayi/aoti_additional_files -> origin/angelayi/aoti_additional_files 2025-09-07T07:46:33.4914570Z * [new branch] angelayi/aoti_inductor_fx -> origin/angelayi/aoti_inductor_fx 2025-09-07T07:46:33.4915156Z * [new branch] angelayi/benchmark -> origin/angelayi/benchmark 2025-09-07T07:46:33.4915736Z * [new branch] angelayi/benchmark2 -> origin/angelayi/benchmark2 2025-09-07T07:46:33.4916548Z * [new branch] angelayi/change_pytree_serialization -> origin/angelayi/change_pytree_serialization 2025-09-07T07:46:33.4917247Z * [new branch] angelayi/cpp_loader -> origin/angelayi/cpp_loader 2025-09-07T07:46:33.4917867Z * [new branch] angelayi/custom_op_subgraph -> origin/angelayi/custom_op_subgraph 2025-09-07T07:46:33.4918462Z * [new branch] angelayi/customop -> 
origin/angelayi/customop 2025-09-07T07:46:33.4919061Z * [new branch] angelayi/fake_cache_empty -> origin/angelayi/fake_cache_empty 2025-09-07T07:46:33.4919706Z * [new branch] angelayi/is_symbolic_tracing -> origin/angelayi/is_symbolic_tracing 2025-09-07T07:46:33.4920295Z * [new branch] angelayi/item -> origin/angelayi/item 2025-09-07T07:46:33.4920846Z * [new branch] angelayi/no_so_weight -> origin/angelayi/no_so_weight 2025-09-07T07:46:33.4921424Z * [new branch] angelayi/opoverload -> origin/angelayi/opoverload 2025-09-07T07:46:33.4922005Z * [new branch] angelayi/pattern -> origin/angelayi/pattern 2025-09-07T07:46:33.4922562Z * [new branch] angelayi/pytree -> origin/angelayi/pytree 2025-09-07T07:46:33.4923313Z * [new branch] angelayi/scan_layers -> origin/angelayi/scan_layers 2025-09-07T07:46:33.4923905Z * [new branch] angelayi/symint_input -> origin/angelayi/symint_input 2025-09-07T07:46:33.4924474Z * [new branch] angelayi/test_cpp -> origin/angelayi/test_cpp 2025-09-07T07:46:33.4925041Z * [new branch] angelayi/torch_size -> origin/angelayi/torch_size 2025-09-07T07:46:33.4925605Z * [new branch] aoti-cuda-alloc -> origin/aoti-cuda-alloc 2025-09-07T07:46:33.4926154Z * [new branch] aoti_target_windows -> origin/aoti_target_windows 2025-09-07T07:46:33.4926703Z * [new branch] aoti_weight_sharing -> origin/aoti_weight_sharing 2025-09-07T07:46:33.4927334Z * [new branch] atalman-inductor-perf-cu124 -> origin/atalman-inductor-perf-cu124 2025-09-07T07:46:33.4955320Z * [new branch] atalman-inductor-perf-cu124.1 -> origin/atalman-inductor-perf-cu124.1 2025-09-07T07:46:33.4956232Z * [new branch] atalman-patch-1 -> origin/atalman-patch-1 2025-09-07T07:46:33.4956791Z * [new branch] atalman-patch-3 -> origin/atalman-patch-3 2025-09-07T07:46:33.4957360Z * [new branch] atalman-patch-4 -> origin/atalman-patch-4 2025-09-07T07:46:33.4957908Z * [new branch] atalman-patch-5 -> origin/atalman-patch-5 2025-09-07T07:46:33.4958455Z * [new branch] atalman-patch-6 -> origin/atalman-patch-6 
2025-09-07T07:46:33.4959030Z * [new branch] atalman_inductor_2.3.0 -> origin/atalman_inductor_2.3.0 2025-09-07T07:46:33.4959715Z * [new branch] atalman_inductor_2.3.1 -> origin/atalman_inductor_2.3.1 2025-09-07T07:46:33.4960318Z * [new branch] atalman_inductor_2.4.0 -> origin/atalman_inductor_2.4.0 2025-09-07T07:46:33.4960920Z * [new branch] atalman_inductor_2.4.x -> origin/atalman_inductor_2.4.x 2025-09-07T07:46:33.4961648Z * [new branch] autoupdate-transformers-pin-via-pr -> origin/autoupdate-transformers-pin-via-pr 2025-09-07T07:46:33.4962367Z * [new branch] bahuang/dtensor_demo -> origin/bahuang/dtensor_demo 2025-09-07T07:46:33.4963095Z * [new branch] bahuang/test -> origin/bahuang/test 2025-09-07T07:46:33.4963588Z * [new branch] base/1.5 -> origin/base/1.5 2025-09-07T07:46:33.4964213Z * [new branch] batching_sdpa_efficient_attention -> origin/batching_sdpa_efficient_attention 2025-09-07T07:46:33.4964856Z * [new branch] bc-lint-config -> origin/bc-lint-config 2025-09-07T07:46:33.4965586Z * [new branch] bc-lint-test-new-config -> origin/bc-lint-test-new-config 2025-09-07T07:46:33.4966192Z * [new branch] benchmark-updates -> origin/benchmark-updates 2025-09-07T07:46:33.4966835Z * [new branch] benchmarker_compat_with_do_bench -> origin/benchmarker_compat_with_do_bench 2025-09-07T07:46:33.4967527Z * [new branch] benchmarking-script -> origin/benchmarking-script 2025-09-07T07:46:33.4968163Z * [new branch] bertmaher/pinbump26 -> origin/bertmaher/pinbump26 2025-09-07T07:46:33.4968737Z * [new branch] bertrand/cutlass -> origin/bertrand/cutlass 2025-09-07T07:46:33.4969317Z * [new branch] bf/cg-custom-wrapper -> origin/bf/cg-custom-wrapper 2025-09-07T07:46:33.4969852Z * [new branch] bf/cg-or-error -> origin/bf/cg-or-error 2025-09-07T07:46:33.4970393Z * [new branch] bf/cg-remove-check -> origin/bf/cg-remove-check 2025-09-07T07:46:33.4970958Z * [new branch] bf/cg-skip-1-kernel -> origin/bf/cg-skip-1-kernel 2025-09-07T07:46:33.4971508Z * [new branch] bf/cudagraph -> 
origin/bf/cudagraph 2025-09-07T07:46:33.4972189Z * [new branch] bf/cudagraph-disable-input-mutation -> origin/bf/cudagraph-disable-input-mutation 2025-09-07T07:46:33.4973205Z * [new branch] bf/cudagraph-enable-input-mutation-support-benchmark -> origin/bf/cudagraph-enable-input-mutation-support-benchmark 2025-09-07T07:46:33.4974127Z * [new branch] bf/cudagraph-partition -> origin/bf/cudagraph-partition 2025-09-07T07:46:33.4974788Z * [new branch] bf/default-recompile-reason -> origin/bf/default-recompile-reason 2025-09-07T07:46:33.4975440Z * [new branch] bf/donated-buffer-bench -> origin/bf/donated-buffer-bench 2025-09-07T07:46:33.4975983Z * [new branch] bf/exp -> origin/bf/exp 2025-09-07T07:46:33.4976514Z * [new branch] bf/pa-non-divisible -> origin/bf/pa-non-divisible 2025-09-07T07:46:33.4977222Z * [new branch] bf/partition-move-cpu -> origin/bf/partition-move-cpu 2025-09-07T07:46:33.4977934Z * [new branch] bf/partition-turn-on -> origin/bf/partition-turn-on 2025-09-07T07:46:33.4978550Z * [new branch] bf/remove-check-55b0c39d -> origin/bf/remove-check-55b0c39d 2025-09-07T07:46:33.4979114Z * [new branch] bf/rope -> origin/bf/rope 2025-09-07T07:46:33.4979675Z * [new branch] bisect_perf_hf_T5_3acc6eac492 -> origin/bisect_perf_hf_T5_3acc6eac492 2025-09-07T07:46:33.4980328Z * [new branch] bisect_perf_hf_T5_3fcf66f61fb -> origin/bisect_perf_hf_T5_3fcf66f61fb 2025-09-07T07:46:33.4980972Z * [new branch] bisect_perf_hf_T5_4009d154129 -> origin/bisect_perf_hf_T5_4009d154129 2025-09-07T07:46:33.4981852Z * [new branch] bisect_perf_hf_T5_40d0740e73d -> origin/bisect_perf_hf_T5_40d0740e73d 2025-09-07T07:46:33.4982483Z * [new branch] bisect_perf_hf_T5_5268754e -> origin/bisect_perf_hf_T5_5268754e 2025-09-07T07:46:33.4983121Z * [new branch] bisect_perf_hf_T5_7d89a8d385c -> origin/bisect_perf_hf_T5_7d89a8d385c 2025-09-07T07:46:33.4983751Z * [new branch] bisect_perf_hf_T5_b7a25c1ee7c -> origin/bisect_perf_hf_T5_b7a25c1ee7c 2025-09-07T07:46:33.4984398Z * [new branch] 
bisect_perf_hf_T5_c25b201583f -> origin/bisect_perf_hf_T5_c25b201583f 2025-09-07T07:46:33.4985037Z * [new branch] bisect_perf_hf_T5_c93e57efac0 -> origin/bisect_perf_hf_T5_c93e57efac0 2025-09-07T07:46:33.4985678Z * [new branch] bisect_perf_hf_T5_ca9813ea149 -> origin/bisect_perf_hf_T5_ca9813ea149 2025-09-07T07:46:33.4986305Z * [new branch] bisect_perf_hf_T5_d65f194a -> origin/bisect_perf_hf_T5_d65f194a 2025-09-07T07:46:33.4987021Z * [new branch] bisect_perf_hf_T5_da94ab0b -> origin/bisect_perf_hf_T5_da94ab0b 2025-09-07T07:46:33.4987660Z * [new branch] bisect_perf_hf_T5_da94ab0b_new -> origin/bisect_perf_hf_T5_da94ab0b_new 2025-09-07T07:46:33.4988321Z * [new branch] bisect_perf_hf_T5_db4e8a1d8a8 -> origin/bisect_perf_hf_T5_db4e8a1d8a8 2025-09-07T07:46:33.4988968Z * [new branch] bisect_perf_hf_T5_e0d97e936a2 -> origin/bisect_perf_hf_T5_e0d97e936a2 2025-09-07T07:46:33.4989604Z * [new branch] bisect_perf_hf_T5_f23621ec563 -> origin/bisect_perf_hf_T5_f23621ec563 2025-09-07T07:46:33.4990232Z * [new branch] bowbao/bench_updates_stage -> origin/bowbao/bench_updates_stage 2025-09-07T07:46:33.4990836Z * [new branch] bowbao/dort_rewriter -> origin/bowbao/dort_rewriter 2025-09-07T07:46:33.4991389Z * [new branch] bowbao/wip_prs -> origin/bowbao/wip_prs 2025-09-07T07:46:33.4991963Z * [new branch] brister/break_tensorbox -> origin/brister/break_tensorbox 2025-09-07T07:46:33.4992585Z * [new branch] brister/custom_fx_backend -> origin/brister/custom_fx_backend 2025-09-07T07:46:33.4993193Z * [new branch] brister/fx_custom_triton -> origin/brister/fx_custom_triton 2025-09-07T07:46:33.4993808Z * [new branch] brister/tensor_box_output -> origin/brister/tensor_box_output 2025-09-07T07:46:33.4994530Z * [new branch] brister/tiled_reduction_no_numel_check -> origin/brister/tiled_reduction_no_numel_check 2025-09-07T07:46:33.4995182Z * [new branch] c57382a49 -> origin/c57382a49 2025-09-07T07:46:33.4995676Z * [new branch] ca_0431d47eaa -> origin/ca_0431d47eaa 2025-09-07T07:46:33.4996173Z * 
[new branch] ca_fix_0431d47eaa -> origin/ca_fix_0431d47eaa 2025-09-07T07:46:33.4997200Z * [new branch] camyll/revert-94bc900da97ad7f3c35b3b819bb53b23c74b581a-for-release-2.8 -> origin/camyll/revert-94bc900da97ad7f3c35b3b819bb53b23c74b581a-for-release-2.8 2025-09-07T07:46:33.4998295Z * [new branch] camyllh/test_setup_hooks_push -> origin/camyllh/test_setup_hooks_push 2025-09-07T07:46:33.4999166Z * [new branch] cherry-pick-149654-by-pytorch_bot_bot_ -> origin/cherry-pick-149654-by-pytorch_bot_bot_ 2025-09-07T07:46:33.5000010Z * [new branch] cherry-pick-151939-by-pytorch_bot_bot_ -> origin/cherry-pick-151939-by-pytorch_bot_bot_ 2025-09-07T07:46:33.5000824Z * [new branch] cherry-pick-154174-by-pytorch_bot_bot_ -> origin/cherry-pick-154174-by-pytorch_bot_bot_ 2025-09-07T07:46:33.5001656Z * [new branch] cherry-pick-156260-by-pytorch_bot_bot_ -> origin/cherry-pick-156260-by-pytorch_bot_bot_ 2025-09-07T07:46:33.5002483Z * [new branch] cherry-pick-157453-by-pytorch_bot_bot_ -> origin/cherry-pick-157453-by-pytorch_bot_bot_ 2025-09-07T07:46:33.5003445Z * [new branch] cherry-pick-157513-by-pytorch_bot_bot_ -> origin/cherry-pick-157513-by-pytorch_bot_bot_ 2025-09-07T07:46:33.5004280Z * [new branch] cherry-pick-157695-by-pytorch_bot_bot_ -> origin/cherry-pick-157695-by-pytorch_bot_bot_ 2025-09-07T07:46:33.5005096Z * [new branch] cherry-pick-157732-by-pytorch_bot_bot_ -> origin/cherry-pick-157732-by-pytorch_bot_bot_ 2025-09-07T07:46:33.5005929Z * [new branch] cherry-pick-158537-by-pytorch_bot_bot_ -> origin/cherry-pick-158537-by-pytorch_bot_bot_ 2025-09-07T07:46:33.5006751Z * [new branch] cherry-pick-159969-by-pytorch_bot_bot_ -> origin/cherry-pick-159969-by-pytorch_bot_bot_ 2025-09-07T07:46:33.5007572Z * [new branch] cherry-pick-160586-by-pytorch_bot_bot_ -> origin/cherry-pick-160586-by-pytorch_bot_bot_ 2025-09-07T07:46:33.5008261Z * [new branch] chilli/flex_vllm -> origin/chilli/flex_vllm 2025-09-07T07:46:33.5008945Z * [new branch] cleanup-inductor-benchmark-images -> 
origin/cleanup-inductor-benchmark-images 2025-09-07T07:46:33.5009763Z * [new branch] codex-testing -> origin/codex-testing 2025-09-07T07:46:33.5010492Z * [new branch] codex/add-helper-function-to-sizevars.py -> origin/codex/add-helper-function-to-sizevars.py 2025-09-07T07:46:33.5011487Z * [new branch] codex/add-helper-function-to-sizevars.py_2025-09-05 -> origin/codex/add-helper-function-to-sizevars.py_2025-09-05 2025-09-07T07:46:33.5012464Z * [new branch] codex/add-metadata-field-for-file-path -> origin/codex/add-metadata-field-for-file-path 2025-09-07T07:46:33.5013440Z * [new branch] codex/add-test-for-inductor-local-cache-behavior -> origin/codex/add-test-for-inductor-local-cache-behavior 2025-09-07T07:46:33.5014575Z * [new branch] codex/create-test-for-tensor-memory-leak-in-cudagraph -> origin/codex/create-test-for-tensor-memory-leak-in-cudagraph 2025-09-07T07:46:33.5015544Z * [new branch] codex/fix-issue-121219-in-pytorch -> origin/codex/fix-issue-121219-in-pytorch 2025-09-07T07:46:33.5016301Z * [new branch] codex/fix-issue-160415-in-pytorch -> origin/codex/fix-issue-160415-in-pytorch 2025-09-07T07:46:33.5017184Z * [new branch] codex/fix-noqengine-quantized-engine-support -> origin/codex/fix-noqengine-quantized-engine-support 2025-09-07T07:46:33.5018164Z * [new branch] codex/fix-pin_memory-error-handling -> origin/codex/fix-pin_memory-error-handling 2025-09-07T07:46:33.5018963Z * [new branch] codex/propose-fix-for-issue-160332 -> origin/codex/propose-fix-for-issue-160332 2025-09-07T07:46:33.5019864Z * [new branch] codex/refactor-lintrunner-config-to-use-uv-run -> origin/codex/refactor-lintrunner-config-to-use-uv-run 2025-09-07T07:46:33.5020958Z * [new branch] codex/remove-allow-untyped-defs-and-fix-type-errors -> origin/codex/remove-allow-untyped-defs-and-fix-type-errors 2025-09-07T07:46:33.5021938Z * [new branch] compile_fsdp2_disable_stream_and_event -> origin/compile_fsdp2_disable_stream_and_event 2025-09-07T07:46:33.5022608Z * [new branch] context_test -> 
origin/context_test 2025-09-07T07:46:33.5023243Z * [new branch] copilot/fix-157446 -> origin/copilot/fix-157446 2025-09-07T07:46:33.5023770Z * [new branch] copy_graph -> origin/copy_graph 2025-09-07T07:46:33.5024310Z * [new branch] cpio/fix_new_ami_tests -> origin/cpio/fix_new_ami_tests 2025-09-07T07:46:33.5024898Z * [new branch] csl/always_produce_xml -> origin/csl/always_produce_xml 2025-09-07T07:46:33.5025510Z * [new branch] csl/build_test_more_procs -> origin/csl/build_test_more_procs 2025-09-07T07:46:33.5026132Z * [new branch] csl/build_test_more_procs2 -> origin/csl/build_test_more_procs2 2025-09-07T07:46:33.5026753Z * [new branch] csl/disable_flaky_cpp_test -> origin/csl/disable_flaky_cpp_test 2025-09-07T07:46:33.5027387Z * [new branch] csl/disable_periodic_test -> origin/csl/disable_periodic_test 2025-09-07T07:46:33.5028044Z * [new branch] csl/exclude_rocm_viable_strict -> origin/csl/exclude_rocm_viable_strict 2025-09-07T07:46:33.5028638Z * [new branch] csl/katex -> origin/csl/katex 2025-09-07T07:46:33.5029157Z * [new branch] csl/larger_runner -> origin/csl/larger_runner 2025-09-07T07:46:33.5029714Z * [new branch] csl/lintrunner_stuff -> origin/csl/lintrunner_stuff 2025-09-07T07:46:33.5030281Z * [new branch] csl/mps_sharding -> origin/csl/mps_sharding 2025-09-07T07:46:33.5030843Z * [new branch] csl/multistage_docker -> origin/csl/multistage_docker 2025-09-07T07:46:33.5031439Z * [new branch] csl/name_link_check_job -> origin/csl/name_link_check_job 2025-09-07T07:46:33.5032008Z * [new branch] csl/no_keep_goin_rocm -> origin/csl/no_keep_goin_rocm 2025-09-07T07:46:33.5032683Z * [new branch] csl/not_600_timeout -> origin/csl/not_600_timeout 2025-09-07T07:46:33.5033243Z * [new branch] csl/revert_open -> origin/csl/revert_open 2025-09-07T07:46:33.5033781Z * [new branch] csl/skip_build -> origin/csl/skip_build 2025-09-07T07:46:33.5034404Z * [new branch] csl/test_cuda_build_large_runner -> origin/csl/test_cuda_build_large_runner 2025-09-07T07:46:33.5035020Z * [new 
branch] csl/win_sccache -> origin/csl/win_sccache 2025-09-07T07:46:33.5035556Z * [new branch] cublasltrelax2 -> origin/cublasltrelax2 2025-09-07T07:46:33.5036086Z * [new branch] cublasrelax2 -> origin/cublasrelax2 2025-09-07T07:46:33.5036628Z * [new branch] cudnnsdparefactor -> origin/cudnnsdparefactor 2025-09-07T07:46:33.5037207Z * [new branch] custom_lowering_dict -> origin/custom_lowering_dict 2025-09-07T07:46:33.5037752Z * [new branch] czhuge_muon_dev -> origin/czhuge_muon_dev 2025-09-07T07:46:33.5038299Z * [new branch] d4l3k/delete_hook -> origin/d4l3k/delete_hook 2025-09-07T07:46:33.5038816Z * [new branch] dcp_zoc -> origin/dcp_zoc 2025-09-07T07:46:33.5039307Z * [new branch] debug-guard -> origin/debug-guard 2025-09-07T07:46:33.5039842Z * [new branch] delete-quant-docs -> origin/delete-quant-docs 2025-09-07T07:46:33.5040816Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.2 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.2 2025-09-07T07:46:33.5042208Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.3 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.3 2025-09-07T07:46:33.5043730Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.4 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.4 2025-09-07T07:46:33.5045771Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.56.0 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.56.0 2025-09-07T07:46:33.5046967Z * [new branch] dependabot/pip/dot-ci/docker/protobuf-5.29.5 -> origin/dependabot/pip/dot-ci/docker/protobuf-5.29.5 2025-09-07T07:46:33.5048024Z * [new branch] dependabot/pip/dot-github/requirements/protobuf-5.29.5 -> origin/dependabot/pip/dot-github/requirements/protobuf-5.29.5 2025-09-07T07:46:33.5048938Z * [new branch] desertfire/test_cpp_wrapper -> origin/desertfire/test_cpp_wrapper 
2025-09-07T07:46:33.5049661Z * [new branch] desertfire/triton-cpu-for-aarch64 -> origin/desertfire/triton-cpu-for-aarch64 2025-09-07T07:46:33.5050366Z * [new branch] dev/joona/MPSNDArrayAdd -> origin/dev/joona/MPSNDArrayAdd 2025-09-07T07:46:33.5050977Z * [new branch] dev/joona/Unranked -> origin/dev/joona/Unranked 2025-09-07T07:46:33.5051557Z * [new branch] dev/joona/cat -> origin/dev/joona/cat 2025-09-07T07:46:33.5052130Z * [new branch] dev/joona/cat_remove_graph -> origin/dev/joona/cat_remove_graph 2025-09-07T07:46:33.5052752Z * [new branch] dev/joona/embeddingbag -> origin/dev/joona/embeddingbag 2025-09-07T07:46:33.5053389Z * [new branch] dev/joona/getTensorsString -> origin/dev/joona/getTensorsString 2025-09-07T07:46:33.5054143Z * [new branch] dev/joona/maxpool2dwithindices_errmsg -> origin/dev/joona/maxpool2dwithindices_errmsg 2025-09-07T07:46:33.5054893Z * [new branch] dev/joona/mps_linear_macos14 -> origin/dev/joona/mps_linear_macos14 2025-09-07T07:46:33.5055489Z * [new branch] dev/joona/sdpa -> origin/dev/joona/sdpa 2025-09-07T07:46:33.5056157Z * [new branch] dev/joona/topk_newapi -> origin/dev/joona/topk_newapi 2025-09-07T07:46:33.5056742Z * [new branch] dev/joona/type_inf -> origin/dev/joona/type_inf 2025-09-07T07:46:33.5057380Z * [new branch] dev/joona/upsize3d -> origin/dev/joona/upsize3d 2025-09-07T07:46:33.5057913Z * [new branch] disable -> origin/disable 2025-09-07T07:46:33.5058414Z * [new branch] e2e-baseline -> origin/e2e-baseline 2025-09-07T07:46:33.5058979Z * [new branch] eigen_for_sparse_addmm_v2 -> origin/eigen_for_sparse_addmm_v2 2025-09-07T07:46:33.5059597Z * [new branch] embg/test_inductor_ci_128B -> origin/embg/test_inductor_ci_128B 2025-09-07T07:46:33.5060206Z * [new branch] embg/test_inductor_ci_base -> origin/embg/test_inductor_ci_base 2025-09-07T07:46:33.5060847Z * [new branch] embg/test_inductor_ci_control -> origin/embg/test_inductor_ci_control 2025-09-07T07:46:33.5061512Z * [new branch] embg/triton_l2_prefetch_128B -> 
origin/embg/triton_l2_prefetch_128B 2025-09-07T07:46:33.5062167Z * [new branch] embg/triton_l2_prefetch_256B -> origin/embg/triton_l2_prefetch_256B 2025-09-07T07:46:33.5062754Z * [new branch] eqy-patch-1 -> origin/eqy-patch-1 2025-09-07T07:46:33.5063253Z * [new branch] eqy-patch-2 -> origin/eqy-patch-2 2025-09-07T07:46:33.5063757Z * [new branch] eqy-patch-3 -> origin/eqy-patch-3 2025-09-07T07:46:33.5064259Z * [new branch] eqy-patch-4 -> origin/eqy-patch-4 2025-09-07T07:46:33.5064838Z * [new branch] example-convert-torch.nn -> origin/example-convert-torch.nn 2025-09-07T07:46:33.5065590Z * [new branch] exclamaforte/add-contiguous-threshold -> origin/exclamaforte/add-contiguous-threshold 2025-09-07T07:46:33.5066327Z * [new branch] exclamaforte/amd-ma -> origin/exclamaforte/amd-ma 2025-09-07T07:46:33.5067052Z * [new branch] exclamaforte/bump-transformer-version -> origin/exclamaforte/bump-transformer-version 2025-09-07T07:46:33.5067981Z * [new branch] exclamaforte/clear-feedback-savers -> origin/exclamaforte/clear-feedback-savers 2025-09-07T07:46:33.5068800Z * [new branch] exclamaforte/combo-kernels-perf-run -> origin/exclamaforte/combo-kernels-perf-run 2025-09-07T07:46:33.5069572Z * [new branch] exclamaforte/do_bench_refactor -> origin/exclamaforte/do_bench_refactor 2025-09-07T07:46:33.5070310Z * [new branch] exclamaforte/enable-mem-dep-fusion -> origin/exclamaforte/enable-mem-dep-fusion 2025-09-07T07:46:33.5071151Z * [new branch] exclamaforte/fix-exhaustive-autotuning -> origin/exclamaforte/fix-exhaustive-autotuning 2025-09-07T07:46:33.5072110Z * [new branch] exclamaforte/fix-exhuastive-autotuning-reland -> origin/exclamaforte/fix-exhuastive-autotuning-reland 2025-09-07T07:46:33.5073047Z * [new branch] exclamaforte/fix-trace-parsing-fx-svg -> origin/exclamaforte/fix-trace-parsing-fx-svg 2025-09-07T07:46:33.5073941Z * [new branch] exclamaforte/force-pointwise-cat-perf-run -> origin/exclamaforte/force-pointwise-cat-perf-run 2025-09-07T07:46:33.5074719Z * [new branch] 
exclamaforte/fusion-data -> origin/exclamaforte/fusion-data 2025-09-07T07:46:33.5075416Z * [new branch] exclamaforte/gemm-benchmark-run -> origin/exclamaforte/gemm-benchmark-run 2025-09-07T07:46:33.5076149Z * [new branch] exclamaforte/gemm-export-model -> origin/exclamaforte/gemm-export-model 2025-09-07T07:46:33.5076822Z * [new branch] exclamaforte/gemm-model -> origin/exclamaforte/gemm-model 2025-09-07T07:46:33.5077628Z * [new branch] exclamaforte/gemm-model-all-data-collection -> origin/exclamaforte/gemm-model-all-data-collection 2025-09-07T07:46:33.5078545Z * [new branch] exclamaforte/gemm-to-amd -> origin/exclamaforte/gemm-to-amd 2025-09-07T07:46:33.5079218Z * [new branch] exclamaforte/just-gemm-model -> origin/exclamaforte/just-gemm-model 2025-09-07T07:46:33.5080018Z * [new branch] exclamaforte/just-gemm-model-no-refactor -> origin/exclamaforte/just-gemm-model-no-refactor 2025-09-07T07:46:33.5080836Z * [new branch] exclamaforte/max-autotune-ieee -> origin/exclamaforte/max-autotune-ieee 2025-09-07T07:46:33.5081540Z * [new branch] exclamaforte/memory-counter -> origin/exclamaforte/memory-counter 2025-09-07T07:46:33.5082233Z * [new branch] exclamaforte/profile-diff-algo -> origin/exclamaforte/profile-diff-algo 2025-09-07T07:46:33.5083124Z * [new branch] exclamaforte/profiler-combo -> origin/exclamaforte/profiler-combo 2025-09-07T07:46:33.5083863Z * [new branch] exclamaforte/test_cpp_wrapper_mode -> origin/exclamaforte/test_cpp_wrapper_mode 2025-09-07T07:46:33.5084674Z * [new branch] exclamaforte/update-autotune-configs -> origin/exclamaforte/update-autotune-configs 2025-09-07T07:46:33.5085539Z * [new branch] exclamaforte/update-autotune-configs-2 -> origin/exclamaforte/update-autotune-configs-2 2025-09-07T07:46:33.5086309Z * [new branch] exclamforte/gemm-model-final -> origin/exclamforte/gemm-model-final 2025-09-07T07:46:33.5086881Z * [new branch] exec -> origin/exec 2025-09-07T07:46:33.5087430Z * [new branch] executorch-module-shim -> 
origin/executorch-module-shim 2025-09-07T07:46:33.5088041Z * [new branch] experimental-mosaic -> origin/experimental-mosaic 2025-09-07T07:46:33.5088609Z * [new branch] export-D58091437 -> origin/export-D58091437 2025-09-07T07:46:33.5089142Z * [new branch] export-D61047529 -> origin/export-D61047529 2025-09-07T07:46:33.5089683Z * [new branch] export-D70112642 -> origin/export-D70112642 2025-09-07T07:46:33.5090217Z * [new branch] export-D71412006 -> origin/export-D71412006 2025-09-07T07:46:33.5090877Z * [new branch] export-D73042989 -> origin/export-D73042989 2025-09-07T07:46:33.5091426Z * [new branch] export-D75183591 -> origin/export-D75183591 2025-09-07T07:46:33.5091956Z * [new branch] export-D75617432 -> origin/export-D75617432 2025-09-07T07:46:33.5092495Z * [new branch] export-D75659965 -> origin/export-D75659965 2025-09-07T07:46:33.5093015Z * [new branch] export-D76080931 -> origin/export-D76080931 2025-09-07T07:46:33.5093544Z * [new branch] export-D76797250 -> origin/export-D76797250 2025-09-07T07:46:33.5094068Z * [new branch] export-D76885271 -> origin/export-D76885271 2025-09-07T07:46:33.5094591Z * [new branch] export-D76885620 -> origin/export-D76885620 2025-09-07T07:46:33.5095117Z * [new branch] export-D76936623 -> origin/export-D76936623 2025-09-07T07:46:33.5095639Z * [new branch] export-D76958268 -> origin/export-D76958268 2025-09-07T07:46:33.5096171Z * [new branch] export-D78375400 -> origin/export-D78375400 2025-09-07T07:46:33.5096690Z * [new branch] export-D78431305 -> origin/export-D78431305 2025-09-07T07:46:33.5097227Z * [new branch] export-D78580107 -> origin/export-D78580107 2025-09-07T07:46:33.5097859Z * [new branch] export-D78822171 -> origin/export-D78822171 2025-09-07T07:46:33.5098398Z * [new branch] export-D78822351 -> origin/export-D78822351 2025-09-07T07:46:33.5098934Z * [new branch] export-D78822507 -> origin/export-D78822507 2025-09-07T07:46:33.5099474Z * [new branch] export-D78826994 -> origin/export-D78826994 
2025-09-07T07:46:33.5100149Z * [new branch] export-D78894324 -> origin/export-D78894324 2025-09-07T07:46:33.5100691Z * [new branch] export-D78929245 -> origin/export-D78929245 2025-09-07T07:46:33.5101238Z * [new branch] export-D78934925 -> origin/export-D78934925 2025-09-07T07:46:33.5101777Z * [new branch] export-D78953203 -> origin/export-D78953203 2025-09-07T07:46:33.5102303Z * [new branch] export-D78953229 -> origin/export-D78953229 2025-09-07T07:46:33.5102840Z * [new branch] export-D78957093 -> origin/export-D78957093 2025-09-07T07:46:33.5103384Z * [new branch] export-D78957389 -> origin/export-D78957389 2025-09-07T07:46:33.5103922Z * [new branch] export-D78996107 -> origin/export-D78996107 2025-09-07T07:46:33.5104462Z * [new branch] export-D79026433 -> origin/export-D79026433 2025-09-07T07:46:33.5104992Z * [new branch] export-D79230339 -> origin/export-D79230339 2025-09-07T07:46:33.5105534Z * [new branch] export-D79319835 -> origin/export-D79319835 2025-09-07T07:46:33.5106074Z * [new branch] export-D79328456 -> origin/export-D79328456 2025-09-07T07:46:33.5106617Z * [new branch] export-D79534608 -> origin/export-D79534608 2025-09-07T07:46:33.5107156Z * [new branch] export-D79785974 -> origin/export-D79785974 2025-09-07T07:46:33.5107684Z * [new branch] export-D80025417 -> origin/export-D80025417 2025-09-07T07:46:33.5108219Z * [new branch] export-D80120333 -> origin/export-D80120333 2025-09-07T07:46:33.5108758Z * [new branch] export-D80214882 -> origin/export-D80214882 2025-09-07T07:46:33.5109283Z * [new branch] export-D80319069 -> origin/export-D80319069 2025-09-07T07:46:33.5109809Z * [new branch] export-D80321215 -> origin/export-D80321215 2025-09-07T07:46:33.5110445Z * [new branch] export-D80503451 -> origin/export-D80503451 2025-09-07T07:46:33.5110993Z * [new branch] export-D80771648 -> origin/export-D80771648 2025-09-07T07:46:33.5111533Z * [new branch] export-D80823877 -> origin/export-D80823877 2025-09-07T07:46:33.5112070Z * [new branch] export-D80948073 -> 
origin/export-D80948073 2025-09-07T07:46:33.5112601Z * [new branch] export-D80958642 -> origin/export-D80958642 2025-09-07T07:46:33.5113150Z * [new branch] export-D80970483 -> origin/export-D80970483 2025-09-07T07:46:33.5113690Z * [new branch] export-D81054193 -> origin/export-D81054193 2025-09-07T07:46:33.5114216Z * [new branch] export-D81060182 -> origin/export-D81060182 2025-09-07T07:46:33.5114747Z * [new branch] export-D81078973 -> origin/export-D81078973 2025-09-07T07:46:33.5115261Z * [new branch] export-D81204584 -> origin/export-D81204584 2025-09-07T07:46:33.5115777Z * [new branch] export-D81284190 -> origin/export-D81284190 2025-09-07T07:46:33.5116306Z * [new branch] export-D81299840 -> origin/export-D81299840 2025-09-07T07:46:33.5116840Z * [new branch] export-D81429090 -> origin/export-D81429090 2025-09-07T07:46:33.5117367Z * [new branch] export-D81698719 -> origin/export-D81698719 2025-09-07T07:46:33.5117882Z * [new branch] export-D81747409 -> origin/export-D81747409 2025-09-07T07:46:33.5118527Z * [new branch] exported-model-train-idempotent -> origin/exported-model-train-idempotent 2025-09-07T07:46:33.5119231Z * [new branch] ezyang/wip-aot-descriptors -> origin/ezyang/wip-aot-descriptors 2025-09-07T07:46:33.5119897Z * [new branch] fa_u8_brgemm -> origin/fa_u8_brgemm 2025-09-07T07:46:33.5120431Z * [new branch] fastmath_baseline -> origin/fastmath_baseline 2025-09-07T07:46:33.5120937Z * [new branch] fbcode/warm -> origin/fbcode/warm 2025-09-07T07:46:33.5121401Z * [new branch] fca -> origin/fca 2025-09-07T07:46:33.5121860Z * [new branch] fca2_ca5984c -> origin/fca2_ca5984c 2025-09-07T07:46:33.5122334Z * [new branch] fca5 -> origin/fca5 2025-09-07T07:46:33.5123078Z * [new branch] feature/function-numa-binding -> origin/feature/function-numa-binding 2025-09-07T07:46:33.5123849Z * [new branch] feature/function-numa-binding-take2 -> origin/feature/function-numa-binding-take2 2025-09-07T07:46:33.5124572Z * [new branch] feature/numa-nproc-fix -> 
origin/feature/numa-nproc-fix 2025-09-07T07:46:33.5125264Z * [new branch] feature/numa-signpost-serialize -> origin/feature/numa-signpost-serialize 2025-09-07T07:46:33.5126008Z * [new branch] feature/parallel-numa-binding -> origin/feature/parallel-numa-binding 2025-09-07T07:46:33.5126685Z * [new branch] fengyuan/external-proj -> origin/fengyuan/external-proj 2025-09-07T07:46:33.5127451Z * [new branch] fengyuan/out-of-tree-xpu-ops-improve-test -> origin/fengyuan/out-of-tree-xpu-ops-improve-test 2025-09-07T07:46:33.5128378Z * [new branch] fengyuan/out-of-tree-xpu-ops-remove-dtype -> origin/fengyuan/out-of-tree-xpu-ops-remove-dtype 2025-09-07T07:46:33.5129123Z * [new branch] fengyuan/test-xpu -> origin/fengyuan/test-xpu 2025-09-07T07:46:33.5129685Z * [new branch] ffast_math_baseline -> origin/ffast_math_baseline 2025-09-07T07:46:33.5130242Z * [new branch] ffast_math_target -> origin/ffast_math_target 2025-09-07T07:46:33.5130786Z * [new branch] findhao/base_commit -> origin/findhao/base_commit 2025-09-07T07:46:33.5131483Z * [new branch] findhao/base_commit1 -> origin/findhao/base_commit1 2025-09-07T07:46:33.5132067Z * [new branch] findhao/multistream2 -> origin/findhao/multistream2 2025-09-07T07:46:33.5132667Z * [new branch] findhao/multistream5 -> origin/findhao/multistream5 2025-09-07T07:46:33.5133255Z * [new branch] findhao/multistream6 -> origin/findhao/multistream6 2025-09-07T07:46:33.5133870Z * [new branch] findhao/operatorbench3 -> origin/findhao/operatorbench3 2025-09-07T07:46:33.5134502Z * [new branch] findhao/operatorbench5 -> origin/findhao/operatorbench5 2025-09-07T07:46:33.5135114Z * [new branch] findhao/tritonparse -> origin/findhao/tritonparse 2025-09-07T07:46:33.5135643Z * [new branch] fix -> origin/fix 2025-09-07T07:46:33.5136196Z * [new branch] fix-ck-gemm-template-format -> origin/fix-ck-gemm-template-format 2025-09-07T07:46:33.5136818Z * [new branch] fix-config-ignore -> origin/fix-config-ignore 2025-09-07T07:46:33.5137445Z * [new branch] 
fix-dict-guard -> origin/fix-dict-guard 2025-09-07T07:46:33.5138059Z * [new branch] fix-inductor-periodic-0528 -> origin/fix-inductor-periodic-0528 2025-09-07T07:46:33.5138676Z * [new branch] fix-mps-benchmark -> origin/fix-mps-benchmark 2025-09-07T07:46:33.5139291Z * [new branch] fix-rlease-feature-template -> origin/fix-rlease-feature-template 2025-09-07T07:46:33.5140024Z * [new branch] fix-run-condition-upload-results -> origin/fix-run-condition-upload-results 2025-09-07T07:46:33.5140686Z * [new branch] fix-torchbench -> origin/fix-torchbench 2025-09-07T07:46:33.5141339Z * [new branch] fix_153389 -> origin/fix_153389 2025-09-07T07:46:33.5141866Z * [new branch] fix_fsdp_rs_bucket2 -> origin/fix_fsdp_rs_bucket2 2025-09-07T07:46:33.5142452Z * [new branch] fix_inductor_peridic_tests -> origin/fix_inductor_peridic_tests 2025-09-07T07:46:33.5143019Z * [new branch] fix_ubn_159469 -> origin/fix_ubn_159469 2025-09-07T07:46:33.5143535Z * [new branch] fixes-triage -> origin/fixes-triage 2025-09-07T07:46:33.5144054Z * [new branch] fixflashinfer -> origin/fixflashinfer 2025-09-07T07:46:33.5144592Z * [new branch] flash_decoding_cpu -> origin/flash_decoding_cpu 2025-09-07T07:46:33.5145109Z * [new branch] flex-flash -> origin/flex-flash 2025-09-07T07:46:33.5145613Z * [new branch] flex-lowering -> origin/flex-lowering 2025-09-07T07:46:33.5146131Z * [new branch] flex-warning -> origin/flex-warning 2025-09-07T07:46:33.5146732Z * [new branch] flex_attention_functorch_grad -> origin/flex_attention_functorch_grad 2025-09-07T07:46:33.5147313Z * [new branch] flex_flash -> origin/flex_flash 2025-09-07T07:46:33.5147856Z * [new branch] flexdecode-gqa-groups -> origin/flexdecode-gqa-groups 2025-09-07T07:46:33.5148517Z * [new branch] fmassa/fix_memeff_sharding_rule -> origin/fmassa/fix_memeff_sharding_rule 2025-09-07T07:46:33.5149139Z * [new branch] fsdp2_trace_rules -> origin/fsdp2_trace_rules 2025-09-07T07:46:33.5149661Z * [new branch] fsdpv2_3d -> origin/fsdpv2_3d 
2025-09-07T07:46:33.5150153Z * [new branch] fsdpv2_3d_m1 -> origin/fsdpv2_3d_m1 2025-09-07T07:46:33.5150646Z * [new branch] fx_cpp -> origin/fx_cpp 2025-09-07T07:46:33.5151129Z * [new branch] fy/fix-win -> origin/fy/fix-win 2025-09-07T07:46:33.5151662Z * [new branch] gh/AlnisM/1/base -> origin/gh/AlnisM/1/base 2025-09-07T07:46:33.5152298Z * [new branch] gh/AlnisM/1/head -> origin/gh/AlnisM/1/head 2025-09-07T07:46:33.5152821Z * [new branch] gh/CaoE/2/base -> origin/gh/CaoE/2/base 2025-09-07T07:46:33.5153346Z * [new branch] gh/CaoE/2/head -> origin/gh/CaoE/2/head 2025-09-07T07:46:33.5153865Z * [new branch] gh/CaoE/2/orig -> origin/gh/CaoE/2/orig 2025-09-07T07:46:33.5154441Z * [new branch] gh/ColinPeppler/79/base -> origin/gh/ColinPeppler/79/base 2025-09-07T07:46:33.5155070Z * [new branch] gh/ColinPeppler/79/head -> origin/gh/ColinPeppler/79/head 2025-09-07T07:46:33.5155686Z * [new branch] gh/ColinPeppler/79/orig -> origin/gh/ColinPeppler/79/orig 2025-09-07T07:46:33.5156313Z * [new branch] gh/ColinPeppler/80/base -> origin/gh/ColinPeppler/80/base 2025-09-07T07:46:33.5156935Z * [new branch] gh/ColinPeppler/80/head -> origin/gh/ColinPeppler/80/head 2025-09-07T07:46:33.5157556Z * [new branch] gh/ColinPeppler/80/orig -> origin/gh/ColinPeppler/80/orig 2025-09-07T07:46:33.5158158Z * [new branch] gh/EikanWang/67/base -> origin/gh/EikanWang/67/base 2025-09-07T07:46:33.5158731Z * [new branch] gh/EikanWang/67/head -> origin/gh/EikanWang/67/head 2025-09-07T07:46:33.5159313Z * [new branch] gh/EikanWang/80/base -> origin/gh/EikanWang/80/base 2025-09-07T07:46:33.5159895Z * [new branch] gh/EikanWang/80/head -> origin/gh/EikanWang/80/head 2025-09-07T07:46:33.5160475Z * [new branch] gh/EikanWang/80/orig -> origin/gh/EikanWang/80/orig 2025-09-07T07:46:33.5161054Z * [new branch] gh/EikanWang/81/base -> origin/gh/EikanWang/81/base 2025-09-07T07:46:33.5161716Z * [new branch] gh/EikanWang/81/head -> origin/gh/EikanWang/81/head 2025-09-07T07:46:33.5162306Z * [new branch] 
gh/EikanWang/81/orig -> origin/gh/EikanWang/81/orig 2025-09-07T07:46:33.5163044Z * [new branch] gh/EikanWang/82/base -> origin/gh/EikanWang/82/base 2025-09-07T07:46:33.5163642Z * [new branch] gh/EikanWang/82/head -> origin/gh/EikanWang/82/head 2025-09-07T07:46:33.5164220Z * [new branch] gh/EikanWang/82/orig -> origin/gh/EikanWang/82/orig 2025-09-07T07:46:33.5164786Z * [new branch] gh/Gasoonjia/1/base -> origin/gh/Gasoonjia/1/base 2025-09-07T07:46:33.5165352Z * [new branch] gh/Gasoonjia/1/head -> origin/gh/Gasoonjia/1/head 2025-09-07T07:46:33.5165916Z * [new branch] gh/H-Huang/131/base -> origin/gh/H-Huang/131/base 2025-09-07T07:46:33.5166467Z * [new branch] gh/H-Huang/131/head -> origin/gh/H-Huang/131/head 2025-09-07T07:46:33.5167008Z * [new branch] gh/H-Huang/131/orig -> origin/gh/H-Huang/131/orig 2025-09-07T07:46:33.5167560Z * [new branch] gh/H-Huang/132/base -> origin/gh/H-Huang/132/base 2025-09-07T07:46:33.5168107Z * [new branch] gh/H-Huang/132/head -> origin/gh/H-Huang/132/head 2025-09-07T07:46:33.5168658Z * [new branch] gh/H-Huang/132/orig -> origin/gh/H-Huang/132/orig 2025-09-07T07:46:33.5169205Z * [new branch] gh/H-Huang/180/base -> origin/gh/H-Huang/180/base 2025-09-07T07:46:33.5169739Z * [new branch] gh/H-Huang/180/head -> origin/gh/H-Huang/180/head 2025-09-07T07:46:33.5170285Z * [new branch] gh/H-Huang/180/orig -> origin/gh/H-Huang/180/orig 2025-09-07T07:46:33.5170835Z * [new branch] gh/H-Huang/182/base -> origin/gh/H-Huang/182/base 2025-09-07T07:46:33.5171386Z * [new branch] gh/H-Huang/182/head -> origin/gh/H-Huang/182/head 2025-09-07T07:46:33.5171939Z * [new branch] gh/H-Huang/182/orig -> origin/gh/H-Huang/182/orig 2025-09-07T07:46:33.5172602Z * [new branch] gh/H-Huang/187/base -> origin/gh/H-Huang/187/base 2025-09-07T07:46:33.5173157Z * [new branch] gh/H-Huang/187/head -> origin/gh/H-Huang/187/head 2025-09-07T07:46:33.5173706Z * [new branch] gh/H-Huang/187/orig -> origin/gh/H-Huang/187/orig 2025-09-07T07:46:33.5174255Z * [new branch] 
gh/H-Huang/202/base -> origin/gh/H-Huang/202/base 2025-09-07T07:46:33.5174807Z * [new branch] gh/H-Huang/202/head -> origin/gh/H-Huang/202/head 2025-09-07T07:46:33.5175353Z * [new branch] gh/H-Huang/202/orig -> origin/gh/H-Huang/202/orig 2025-09-07T07:46:33.5175907Z * [new branch] gh/H-Huang/203/base -> origin/gh/H-Huang/203/base 2025-09-07T07:46:33.5176470Z * [new branch] gh/H-Huang/203/head -> origin/gh/H-Huang/203/head 2025-09-07T07:46:33.5177021Z * [new branch] gh/H-Huang/203/orig -> origin/gh/H-Huang/203/orig 2025-09-07T07:46:33.5177659Z * [new branch] gh/H-Huang/204/base -> origin/gh/H-Huang/204/base 2025-09-07T07:46:33.5178217Z * [new branch] gh/H-Huang/204/head -> origin/gh/H-Huang/204/head 2025-09-07T07:46:33.5178771Z * [new branch] gh/H-Huang/204/orig -> origin/gh/H-Huang/204/orig 2025-09-07T07:46:33.5179326Z * [new branch] gh/H-Huang/205/base -> origin/gh/H-Huang/205/base 2025-09-07T07:46:33.5179880Z * [new branch] gh/H-Huang/205/head -> origin/gh/H-Huang/205/head 2025-09-07T07:46:33.5180419Z * [new branch] gh/H-Huang/205/orig -> origin/gh/H-Huang/205/orig 2025-09-07T07:46:33.5180968Z * [new branch] gh/H-Huang/206/base -> origin/gh/H-Huang/206/base 2025-09-07T07:46:33.5181649Z * [new branch] gh/H-Huang/206/head -> origin/gh/H-Huang/206/head 2025-09-07T07:46:33.5182208Z * [new branch] gh/H-Huang/206/orig -> origin/gh/H-Huang/206/orig 2025-09-07T07:46:33.5182762Z * [new branch] gh/H-Huang/207/base -> origin/gh/H-Huang/207/base 2025-09-07T07:46:33.5183299Z * [new branch] gh/H-Huang/207/head -> origin/gh/H-Huang/207/head 2025-09-07T07:46:33.5183848Z * [new branch] gh/H-Huang/207/orig -> origin/gh/H-Huang/207/orig 2025-09-07T07:46:33.5184399Z * [new branch] gh/H-Huang/208/base -> origin/gh/H-Huang/208/base 2025-09-07T07:46:33.5184951Z * [new branch] gh/H-Huang/208/head -> origin/gh/H-Huang/208/head 2025-09-07T07:46:33.5185502Z * [new branch] gh/H-Huang/208/orig -> origin/gh/H-Huang/208/orig 2025-09-07T07:46:33.5186038Z * [new branch] gh/H-Huang/209/base -> 
origin/gh/H-Huang/209/base 2025-09-07T07:46:33.5186598Z * [new branch] gh/H-Huang/209/head -> origin/gh/H-Huang/209/head 2025-09-07T07:46:33.5187145Z * [new branch] gh/H-Huang/209/orig -> origin/gh/H-Huang/209/orig 2025-09-07T07:46:33.5187694Z * [new branch] gh/H-Huang/210/base -> origin/gh/H-Huang/210/base 2025-09-07T07:46:33.5188242Z * [new branch] gh/H-Huang/210/head -> origin/gh/H-Huang/210/head 2025-09-07T07:46:33.5188772Z * [new branch] gh/H-Huang/210/orig -> origin/gh/H-Huang/210/orig 2025-09-07T07:46:33.5189315Z * [new branch] gh/H-Huang/211/base -> origin/gh/H-Huang/211/base 2025-09-07T07:46:33.5189854Z * [new branch] gh/H-Huang/211/head -> origin/gh/H-Huang/211/head 2025-09-07T07:46:33.5190390Z * [new branch] gh/H-Huang/211/orig -> origin/gh/H-Huang/211/orig 2025-09-07T07:46:33.5190924Z * [new branch] gh/H-Huang/212/base -> origin/gh/H-Huang/212/base 2025-09-07T07:46:33.5191456Z * [new branch] gh/H-Huang/212/head -> origin/gh/H-Huang/212/head 2025-09-07T07:46:33.5192096Z * [new branch] gh/H-Huang/212/orig -> origin/gh/H-Huang/212/orig 2025-09-07T07:46:33.5192640Z * [new branch] gh/H-Huang/213/base -> origin/gh/H-Huang/213/base 2025-09-07T07:46:33.5193178Z * [new branch] gh/H-Huang/213/head -> origin/gh/H-Huang/213/head 2025-09-07T07:46:33.5193708Z * [new branch] gh/H-Huang/213/orig -> origin/gh/H-Huang/213/orig 2025-09-07T07:46:33.5194248Z * [new branch] gh/H-Huang/214/base -> origin/gh/H-Huang/214/base 2025-09-07T07:46:33.5194784Z * [new branch] gh/H-Huang/214/head -> origin/gh/H-Huang/214/head 2025-09-07T07:46:33.5195323Z * [new branch] gh/H-Huang/214/orig -> origin/gh/H-Huang/214/orig 2025-09-07T07:46:33.5195908Z * [new branch] gh/IvanKobzarev/112/base -> origin/gh/IvanKobzarev/112/base 2025-09-07T07:46:33.5196526Z * [new branch] gh/IvanKobzarev/112/head -> origin/gh/IvanKobzarev/112/head 2025-09-07T07:46:33.5197142Z * [new branch] gh/IvanKobzarev/112/orig -> origin/gh/IvanKobzarev/112/orig 2025-09-07T07:46:33.5197760Z * [new branch] 
gh/IvanKobzarev/115/base -> origin/gh/IvanKobzarev/115/base 2025-09-07T07:46:33.5198377Z * [new branch] gh/IvanKobzarev/115/head -> origin/gh/IvanKobzarev/115/head 2025-09-07T07:46:33.5198993Z * [new branch] gh/IvanKobzarev/115/orig -> origin/gh/IvanKobzarev/115/orig 2025-09-07T07:46:33.5199603Z * [new branch] gh/IvanKobzarev/116/base -> origin/gh/IvanKobzarev/116/base 2025-09-07T07:46:33.5200218Z * [new branch] gh/IvanKobzarev/116/head -> origin/gh/IvanKobzarev/116/head 2025-09-07T07:46:33.5200831Z * [new branch] gh/IvanKobzarev/116/orig -> origin/gh/IvanKobzarev/116/orig 2025-09-07T07:46:33.5201551Z * [new branch] gh/IvanKobzarev/118/base -> origin/gh/IvanKobzarev/118/base 2025-09-07T07:46:33.5202175Z * [new branch] gh/IvanKobzarev/118/head -> origin/gh/IvanKobzarev/118/head 2025-09-07T07:46:33.5202781Z * [new branch] gh/IvanKobzarev/118/orig -> origin/gh/IvanKobzarev/118/orig 2025-09-07T07:46:33.5203573Z * [new branch] gh/IvanKobzarev/126/base -> origin/gh/IvanKobzarev/126/base 2025-09-07T07:46:33.5204194Z * [new branch] gh/IvanKobzarev/126/head -> origin/gh/IvanKobzarev/126/head 2025-09-07T07:46:33.5204812Z * [new branch] gh/IvanKobzarev/126/orig -> origin/gh/IvanKobzarev/126/orig 2025-09-07T07:46:33.5205504Z * [new branch] gh/IvanKobzarev/127/base -> origin/gh/IvanKobzarev/127/base 2025-09-07T07:46:33.5206115Z * [new branch] gh/IvanKobzarev/127/head -> origin/gh/IvanKobzarev/127/head 2025-09-07T07:46:33.5206735Z * [new branch] gh/IvanKobzarev/127/orig -> origin/gh/IvanKobzarev/127/orig 2025-09-07T07:46:33.5207354Z * [new branch] gh/IvanKobzarev/128/base -> origin/gh/IvanKobzarev/128/base 2025-09-07T07:46:33.5207980Z * [new branch] gh/IvanKobzarev/128/head -> origin/gh/IvanKobzarev/128/head 2025-09-07T07:46:33.5208596Z * [new branch] gh/IvanKobzarev/128/orig -> origin/gh/IvanKobzarev/128/orig 2025-09-07T07:46:33.5209204Z * [new branch] gh/IvanKobzarev/132/base -> origin/gh/IvanKobzarev/132/base 2025-09-07T07:46:33.5209817Z * [new branch] 
gh/IvanKobzarev/132/head -> origin/gh/IvanKobzarev/132/head 2025-09-07T07:46:33.5210431Z * [new branch] gh/IvanKobzarev/132/orig -> origin/gh/IvanKobzarev/132/orig 2025-09-07T07:46:33.5211047Z * [new branch] gh/IvanKobzarev/133/base -> origin/gh/IvanKobzarev/133/base 2025-09-07T07:46:33.5211672Z * [new branch] gh/IvanKobzarev/133/head -> origin/gh/IvanKobzarev/133/head 2025-09-07T07:46:33.5212277Z * [new branch] gh/IvanKobzarev/133/orig -> origin/gh/IvanKobzarev/133/orig 2025-09-07T07:46:33.5213024Z * [new branch] gh/IvanKobzarev/134/base -> origin/gh/IvanKobzarev/134/base 2025-09-07T07:46:33.5213648Z * [new branch] gh/IvanKobzarev/134/head -> origin/gh/IvanKobzarev/134/head 2025-09-07T07:46:33.5214261Z * [new branch] gh/IvanKobzarev/134/orig -> origin/gh/IvanKobzarev/134/orig 2025-09-07T07:46:33.5214877Z * [new branch] gh/IvanKobzarev/135/base -> origin/gh/IvanKobzarev/135/base 2025-09-07T07:46:33.5215477Z * [new branch] gh/IvanKobzarev/135/head -> origin/gh/IvanKobzarev/135/head 2025-09-07T07:46:33.5216095Z * [new branch] gh/IvanKobzarev/135/orig -> origin/gh/IvanKobzarev/135/orig 2025-09-07T07:46:33.5216715Z * [new branch] gh/IvanKobzarev/136/base -> origin/gh/IvanKobzarev/136/base 2025-09-07T07:46:33.5217437Z * [new branch] gh/IvanKobzarev/136/head -> origin/gh/IvanKobzarev/136/head 2025-09-07T07:46:33.5218063Z * [new branch] gh/IvanKobzarev/136/orig -> origin/gh/IvanKobzarev/136/orig 2025-09-07T07:46:33.5218668Z * [new branch] gh/IvanKobzarev/137/base -> origin/gh/IvanKobzarev/137/base 2025-09-07T07:46:33.5219289Z * [new branch] gh/IvanKobzarev/137/head -> origin/gh/IvanKobzarev/137/head 2025-09-07T07:46:33.5219906Z * [new branch] gh/IvanKobzarev/137/orig -> origin/gh/IvanKobzarev/137/orig 2025-09-07T07:46:33.5220520Z * [new branch] gh/IvanKobzarev/138/base -> origin/gh/IvanKobzarev/138/base 2025-09-07T07:46:33.5221149Z * [new branch] gh/IvanKobzarev/138/head -> origin/gh/IvanKobzarev/138/head 2025-09-07T07:46:33.5221760Z * [new branch] 
gh/IvanKobzarev/138/orig -> origin/gh/IvanKobzarev/138/orig 2025-09-07T07:46:33.5222379Z * [new branch] gh/IvanKobzarev/139/base -> origin/gh/IvanKobzarev/139/base 2025-09-07T07:46:33.5223111Z * [new branch] gh/IvanKobzarev/139/head -> origin/gh/IvanKobzarev/139/head 2025-09-07T07:46:33.5223737Z * [new branch] gh/IvanKobzarev/139/orig -> origin/gh/IvanKobzarev/139/orig 2025-09-07T07:46:33.5224361Z * [new branch] gh/IvanKobzarev/140/base -> origin/gh/IvanKobzarev/140/base 2025-09-07T07:46:33.5224975Z * [new branch] gh/IvanKobzarev/140/head -> origin/gh/IvanKobzarev/140/head 2025-09-07T07:46:33.5225605Z * [new branch] gh/IvanKobzarev/140/orig -> origin/gh/IvanKobzarev/140/orig 2025-09-07T07:46:33.5226231Z * [new branch] gh/IvanKobzarev/141/base -> origin/gh/IvanKobzarev/141/base 2025-09-07T07:46:33.5226858Z * [new branch] gh/IvanKobzarev/141/head -> origin/gh/IvanKobzarev/141/head 2025-09-07T07:46:33.5227487Z * [new branch] gh/IvanKobzarev/141/orig -> origin/gh/IvanKobzarev/141/orig 2025-09-07T07:46:33.5228108Z * [new branch] gh/IvanKobzarev/142/base -> origin/gh/IvanKobzarev/142/base 2025-09-07T07:46:33.5228733Z * [new branch] gh/IvanKobzarev/142/head -> origin/gh/IvanKobzarev/142/head 2025-09-07T07:46:33.5229366Z * [new branch] gh/IvanKobzarev/142/orig -> origin/gh/IvanKobzarev/142/orig 2025-09-07T07:46:33.5229995Z * [new branch] gh/IvanKobzarev/143/base -> origin/gh/IvanKobzarev/143/base 2025-09-07T07:46:33.5230610Z * [new branch] gh/IvanKobzarev/143/head -> origin/gh/IvanKobzarev/143/head 2025-09-07T07:46:33.5231237Z * [new branch] gh/IvanKobzarev/143/orig -> origin/gh/IvanKobzarev/143/orig 2025-09-07T07:46:33.5231867Z * [new branch] gh/IvanKobzarev/144/base -> origin/gh/IvanKobzarev/144/base 2025-09-07T07:46:33.5232493Z * [new branch] gh/IvanKobzarev/144/head -> origin/gh/IvanKobzarev/144/head 2025-09-07T07:46:33.5233122Z * [new branch] gh/IvanKobzarev/144/orig -> origin/gh/IvanKobzarev/144/orig 2025-09-07T07:46:33.5233743Z * [new branch] 
gh/IvanKobzarev/145/base -> origin/gh/IvanKobzarev/145/base 2025-09-07T07:46:33.5234459Z * [new branch] gh/IvanKobzarev/145/head -> origin/gh/IvanKobzarev/145/head 2025-09-07T07:46:33.5235090Z * [new branch] gh/IvanKobzarev/145/orig -> origin/gh/IvanKobzarev/145/orig 2025-09-07T07:46:33.5235724Z * [new branch] gh/IvanKobzarev/146/base -> origin/gh/IvanKobzarev/146/base 2025-09-07T07:46:33.5236354Z * [new branch] gh/IvanKobzarev/146/head -> origin/gh/IvanKobzarev/146/head 2025-09-07T07:46:33.5236968Z * [new branch] gh/IvanKobzarev/146/orig -> origin/gh/IvanKobzarev/146/orig 2025-09-07T07:46:33.5237602Z * [new branch] gh/NikhilAPatel/1/base -> origin/gh/NikhilAPatel/1/base 2025-09-07T07:46:33.5238225Z * [new branch] gh/NikhilAPatel/1/head -> origin/gh/NikhilAPatel/1/head 2025-09-07T07:46:33.5238846Z * [new branch] gh/NikhilAPatel/2/base -> origin/gh/NikhilAPatel/2/base 2025-09-07T07:46:33.5239464Z * [new branch] gh/NikhilAPatel/2/head -> origin/gh/NikhilAPatel/2/head 2025-09-07T07:46:33.5240066Z * [new branch] gh/NikhilAPatel/4/base -> origin/gh/NikhilAPatel/4/base 2025-09-07T07:46:33.5240676Z * [new branch] gh/NikhilAPatel/4/head -> origin/gh/NikhilAPatel/4/head 2025-09-07T07:46:33.5241261Z * [new branch] gh/PaliC/1/base -> origin/gh/PaliC/1/base 2025-09-07T07:46:33.5241807Z * [new branch] gh/PaliC/1/head -> origin/gh/PaliC/1/head 2025-09-07T07:46:33.5242348Z * [new branch] gh/PaliC/1/orig -> origin/gh/PaliC/1/orig 2025-09-07T07:46:33.5243128Z * [new branch] gh/PaliC/17/base -> origin/gh/PaliC/17/base 2025-09-07T07:46:33.5243681Z * [new branch] gh/PaliC/17/head -> origin/gh/PaliC/17/head 2025-09-07T07:46:33.5244371Z * [new branch] gh/PaliC/17/orig -> origin/gh/PaliC/17/orig 2025-09-07T07:46:33.5244913Z * [new branch] gh/PaliC/18/base -> origin/gh/PaliC/18/base 2025-09-07T07:46:33.5245460Z * [new branch] gh/PaliC/18/head -> origin/gh/PaliC/18/head 2025-09-07T07:46:33.5245986Z * [new branch] gh/PaliC/18/orig -> origin/gh/PaliC/18/orig 2025-09-07T07:46:33.5246533Z * 
[new branch] gh/PaliC/2/base -> origin/gh/PaliC/2/base 2025-09-07T07:46:33.5247074Z * [new branch] gh/PaliC/2/head -> origin/gh/PaliC/2/head 2025-09-07T07:46:33.5247610Z * [new branch] gh/PaliC/2/orig -> origin/gh/PaliC/2/orig 2025-09-07T07:46:33.5248151Z * [new branch] gh/PaliC/20/base -> origin/gh/PaliC/20/base 2025-09-07T07:46:33.5248680Z * [new branch] gh/PaliC/20/head -> origin/gh/PaliC/20/head 2025-09-07T07:46:33.5249227Z * [new branch] gh/PaliC/20/orig -> origin/gh/PaliC/20/orig 2025-09-07T07:46:33.5249770Z * [new branch] gh/PaliC/21/base -> origin/gh/PaliC/21/base 2025-09-07T07:46:33.5250316Z * [new branch] gh/PaliC/21/head -> origin/gh/PaliC/21/head 2025-09-07T07:46:33.5250846Z * [new branch] gh/PaliC/21/orig -> origin/gh/PaliC/21/orig 2025-09-07T07:46:33.5251394Z * [new branch] gh/PaliC/22/base -> origin/gh/PaliC/22/base 2025-09-07T07:46:33.5251937Z * [new branch] gh/PaliC/22/head -> origin/gh/PaliC/22/head 2025-09-07T07:46:33.5252481Z * [new branch] gh/PaliC/22/orig -> origin/gh/PaliC/22/orig 2025-09-07T07:46:33.5253022Z * [new branch] gh/PaliC/23/base -> origin/gh/PaliC/23/base 2025-09-07T07:46:33.5253555Z * [new branch] gh/PaliC/23/head -> origin/gh/PaliC/23/head 2025-09-07T07:46:33.5254105Z * [new branch] gh/PaliC/23/orig -> origin/gh/PaliC/23/orig 2025-09-07T07:46:33.5254647Z * [new branch] gh/PaliC/24/base -> origin/gh/PaliC/24/base 2025-09-07T07:46:33.5255302Z * [new branch] gh/PaliC/24/head -> origin/gh/PaliC/24/head 2025-09-07T07:46:33.5255856Z * [new branch] gh/PaliC/24/orig -> origin/gh/PaliC/24/orig 2025-09-07T07:46:33.5256435Z * [new branch] gh/PaulZhang12/17/base -> origin/gh/PaulZhang12/17/base 2025-09-07T07:46:33.5257052Z * [new branch] gh/PaulZhang12/17/head -> origin/gh/PaulZhang12/17/head 2025-09-07T07:46:33.5257740Z * [new branch] gh/PaulZhang12/20/base -> origin/gh/PaulZhang12/20/base 2025-09-07T07:46:33.5258349Z * [new branch] gh/PaulZhang12/20/head -> origin/gh/PaulZhang12/20/head 2025-09-07T07:46:33.5258957Z * [new branch] 
gh/PaulZhang12/20/orig -> origin/gh/PaulZhang12/20/orig 2025-09-07T07:46:33.5259553Z * [new branch] gh/PaulZhang12/21/base -> origin/gh/PaulZhang12/21/base 2025-09-07T07:46:33.5260165Z * [new branch] gh/PaulZhang12/21/head -> origin/gh/PaulZhang12/21/head 2025-09-07T07:46:33.5260766Z * [new branch] gh/PaulZhang12/21/orig -> origin/gh/PaulZhang12/21/orig 2025-09-07T07:46:33.5261367Z * [new branch] gh/PaulZhang12/22/base -> origin/gh/PaulZhang12/22/base 2025-09-07T07:46:33.5261977Z * [new branch] gh/PaulZhang12/22/head -> origin/gh/PaulZhang12/22/head 2025-09-07T07:46:33.5262566Z * [new branch] gh/PaulZhang12/22/orig -> origin/gh/PaulZhang12/22/orig 2025-09-07T07:46:33.5263169Z * [new branch] gh/PaulZhang12/23/base -> origin/gh/PaulZhang12/23/base 2025-09-07T07:46:33.5263768Z * [new branch] gh/PaulZhang12/23/head -> origin/gh/PaulZhang12/23/head 2025-09-07T07:46:33.5264375Z * [new branch] gh/PaulZhang12/23/orig -> origin/gh/PaulZhang12/23/orig 2025-09-07T07:46:33.5265104Z * [new branch] gh/PaulZhang12/24/base -> origin/gh/PaulZhang12/24/base 2025-09-07T07:46:33.5265700Z * [new branch] gh/PaulZhang12/24/head -> origin/gh/PaulZhang12/24/head 2025-09-07T07:46:33.5266312Z * [new branch] gh/PaulZhang12/24/orig -> origin/gh/PaulZhang12/24/orig 2025-09-07T07:46:33.5266916Z * [new branch] gh/PaulZhang12/25/base -> origin/gh/PaulZhang12/25/base 2025-09-07T07:46:33.5267521Z * [new branch] gh/PaulZhang12/25/head -> origin/gh/PaulZhang12/25/head 2025-09-07T07:46:33.5268122Z * [new branch] gh/PaulZhang12/25/orig -> origin/gh/PaulZhang12/25/orig 2025-09-07T07:46:33.5268713Z * [new branch] gh/SamGinzburg/11/base -> origin/gh/SamGinzburg/11/base 2025-09-07T07:46:33.5269323Z * [new branch] gh/SamGinzburg/11/head -> origin/gh/SamGinzburg/11/head 2025-09-07T07:46:33.5269959Z * [new branch] gh/Sidharth123-cpu/24/base -> origin/gh/Sidharth123-cpu/24/base 2025-09-07T07:46:33.5270604Z * [new branch] gh/Sidharth123-cpu/25/base -> origin/gh/Sidharth123-cpu/25/base 
2025-09-07T07:46:33.5271238Z * [new branch] gh/Sidharth123-cpu/26/base -> origin/gh/Sidharth123-cpu/26/base 2025-09-07T07:46:33.5271882Z * [new branch] gh/Sidharth123-cpu/27/base -> origin/gh/Sidharth123-cpu/27/base 2025-09-07T07:46:33.5272501Z * [new branch] gh/StrongerXi/1/base -> origin/gh/StrongerXi/1/base 2025-09-07T07:46:33.5273093Z * [new branch] gh/StrongerXi/1/head -> origin/gh/StrongerXi/1/head 2025-09-07T07:46:33.5273697Z * [new branch] gh/StrongerXi/133/base -> origin/gh/StrongerXi/133/base 2025-09-07T07:46:33.5274292Z * [new branch] gh/StrongerXi/133/head -> origin/gh/StrongerXi/133/head 2025-09-07T07:46:33.5274895Z * [new branch] gh/StrongerXi/133/orig -> origin/gh/StrongerXi/133/orig 2025-09-07T07:46:33.5275499Z * [new branch] gh/StrongerXi/134/base -> origin/gh/StrongerXi/134/base 2025-09-07T07:46:33.5276208Z * [new branch] gh/StrongerXi/134/head -> origin/gh/StrongerXi/134/head 2025-09-07T07:46:33.5276815Z * [new branch] gh/StrongerXi/134/orig -> origin/gh/StrongerXi/134/orig 2025-09-07T07:46:33.5277404Z * [new branch] gh/StrongerXi/136/base -> origin/gh/StrongerXi/136/base 2025-09-07T07:46:33.5278913Z * [new branch] gh/StrongerXi/136/head -> origin/gh/StrongerXi/136/head 2025-09-07T07:46:33.5279535Z * [new branch] gh/StrongerXi/136/orig -> origin/gh/StrongerXi/136/orig 2025-09-07T07:46:33.5280137Z * [new branch] gh/StrongerXi/137/base -> origin/gh/StrongerXi/137/base 2025-09-07T07:46:33.5280744Z * [new branch] gh/StrongerXi/137/head -> origin/gh/StrongerXi/137/head 2025-09-07T07:46:33.5281342Z * [new branch] gh/StrongerXi/137/orig -> origin/gh/StrongerXi/137/orig 2025-09-07T07:46:33.5281955Z * [new branch] gh/StrongerXi/138/base -> origin/gh/StrongerXi/138/base 2025-09-07T07:46:33.5282561Z * [new branch] gh/StrongerXi/138/head -> origin/gh/StrongerXi/138/head 2025-09-07T07:46:33.5283355Z * [new branch] gh/StrongerXi/138/orig -> origin/gh/StrongerXi/138/orig 2025-09-07T07:46:33.5283963Z * [new branch] gh/StrongerXi/139/base -> 
origin/gh/StrongerXi/139/base 2025-09-07T07:46:33.5284561Z * [new branch] gh/StrongerXi/139/head -> origin/gh/StrongerXi/139/head 2025-09-07T07:46:33.5285164Z * [new branch] gh/StrongerXi/139/orig -> origin/gh/StrongerXi/139/orig 2025-09-07T07:46:33.5285767Z * [new branch] gh/StrongerXi/140/base -> origin/gh/StrongerXi/140/base 2025-09-07T07:46:33.5286373Z * [new branch] gh/StrongerXi/140/head -> origin/gh/StrongerXi/140/head 2025-09-07T07:46:33.5287144Z * [new branch] gh/StrongerXi/140/orig -> origin/gh/StrongerXi/140/orig 2025-09-07T07:46:33.5287732Z * [new branch] gh/StrongerXi/71/base -> origin/gh/StrongerXi/71/base 2025-09-07T07:46:33.5288328Z * [new branch] gh/StrongerXi/71/head -> origin/gh/StrongerXi/71/head 2025-09-07T07:46:33.5288919Z * [new branch] gh/StrongerXi/72/base -> origin/gh/StrongerXi/72/base 2025-09-07T07:46:33.5289515Z * [new branch] gh/StrongerXi/72/head -> origin/gh/StrongerXi/72/head 2025-09-07T07:46:33.5290095Z * [new branch] gh/XilunWu/133/base -> origin/gh/XilunWu/133/base 2025-09-07T07:46:33.5290654Z * [new branch] gh/XilunWu/133/head -> origin/gh/XilunWu/133/head 2025-09-07T07:46:33.5291227Z * [new branch] gh/XilunWu/133/orig -> origin/gh/XilunWu/133/orig 2025-09-07T07:46:33.5291789Z * [new branch] gh/XilunWu/139/base -> origin/gh/XilunWu/139/base 2025-09-07T07:46:33.5292359Z * [new branch] gh/XilunWu/139/head -> origin/gh/XilunWu/139/head 2025-09-07T07:46:33.5292920Z * [new branch] gh/XilunWu/139/orig -> origin/gh/XilunWu/139/orig 2025-09-07T07:46:33.5293470Z * [new branch] gh/XilunWu/143/base -> origin/gh/XilunWu/143/base 2025-09-07T07:46:33.5294027Z * [new branch] gh/XilunWu/143/head -> origin/gh/XilunWu/143/head 2025-09-07T07:46:33.5294588Z * [new branch] gh/XilunWu/143/orig -> origin/gh/XilunWu/143/orig 2025-09-07T07:46:33.5295156Z * [new branch] gh/XilunWu/144/base -> origin/gh/XilunWu/144/base 2025-09-07T07:46:33.5295719Z * [new branch] gh/XilunWu/144/head -> origin/gh/XilunWu/144/head 2025-09-07T07:46:33.5296269Z * [new branch] 
gh/XilunWu/144/orig -> origin/gh/XilunWu/144/orig 2025-09-07T07:46:33.5296833Z * [new branch] gh/XilunWu/145/base -> origin/gh/XilunWu/145/base 2025-09-07T07:46:33.5297483Z * [new branch] gh/XilunWu/145/head -> origin/gh/XilunWu/145/head 2025-09-07T07:46:33.5298171Z * [new branch] gh/XilunWu/145/orig -> origin/gh/XilunWu/145/orig 2025-09-07T07:46:33.5298737Z * [new branch] gh/XilunWu/146/base -> origin/gh/XilunWu/146/base 2025-09-07T07:46:33.5299286Z * [new branch] gh/XilunWu/146/head -> origin/gh/XilunWu/146/head 2025-09-07T07:46:33.5299851Z * [new branch] gh/XilunWu/146/orig -> origin/gh/XilunWu/146/orig 2025-09-07T07:46:33.5300417Z * [new branch] gh/XilunWu/147/base -> origin/gh/XilunWu/147/base 2025-09-07T07:46:33.5300981Z * [new branch] gh/XilunWu/147/head -> origin/gh/XilunWu/147/head 2025-09-07T07:46:33.5301533Z * [new branch] gh/XilunWu/147/orig -> origin/gh/XilunWu/147/orig 2025-09-07T07:46:33.5302102Z * [new branch] gh/XilunWu/148/base -> origin/gh/XilunWu/148/base 2025-09-07T07:46:33.5302670Z * [new branch] gh/XilunWu/148/head -> origin/gh/XilunWu/148/head 2025-09-07T07:46:33.5303238Z * [new branch] gh/XilunWu/148/orig -> origin/gh/XilunWu/148/orig 2025-09-07T07:46:33.5303800Z * [new branch] gh/XilunWu/149/base -> origin/gh/XilunWu/149/base 2025-09-07T07:46:33.5304356Z * [new branch] gh/XilunWu/149/head -> origin/gh/XilunWu/149/head 2025-09-07T07:46:33.5304921Z * [new branch] gh/XilunWu/149/orig -> origin/gh/XilunWu/149/orig 2025-09-07T07:46:33.5305488Z * [new branch] gh/XilunWu/150/base -> origin/gh/XilunWu/150/base 2025-09-07T07:46:33.5306056Z * [new branch] gh/XilunWu/150/head -> origin/gh/XilunWu/150/head 2025-09-07T07:46:33.5306620Z * [new branch] gh/XilunWu/150/orig -> origin/gh/XilunWu/150/orig 2025-09-07T07:46:33.5307267Z * [new branch] gh/XilunWu/151/base -> origin/gh/XilunWu/151/base 2025-09-07T07:46:33.5307839Z * [new branch] gh/XilunWu/151/head -> origin/gh/XilunWu/151/head 2025-09-07T07:46:33.5308412Z * [new branch] gh/XilunWu/151/orig -> 
origin/gh/XilunWu/151/orig 2025-09-07T07:46:33.5308980Z * [new branch] gh/XilunWu/152/base -> origin/gh/XilunWu/152/base 2025-09-07T07:46:33.5309546Z * [new branch] gh/XilunWu/152/head -> origin/gh/XilunWu/152/head 2025-09-07T07:46:33.5310096Z * [new branch] gh/XilunWu/152/orig -> origin/gh/XilunWu/152/orig 2025-09-07T07:46:33.5310662Z * [new branch] gh/XilunWu/153/base -> origin/gh/XilunWu/153/base 2025-09-07T07:46:33.5311224Z * [new branch] gh/XilunWu/153/head -> origin/gh/XilunWu/153/head 2025-09-07T07:46:33.5311786Z * [new branch] gh/XilunWu/153/orig -> origin/gh/XilunWu/153/orig 2025-09-07T07:46:33.5312359Z * [new branch] gh/XilunWu/160/base -> origin/gh/XilunWu/160/base 2025-09-07T07:46:33.5312914Z * [new branch] gh/XilunWu/160/head -> origin/gh/XilunWu/160/head 2025-09-07T07:46:33.5313473Z * [new branch] gh/XilunWu/160/orig -> origin/gh/XilunWu/160/orig 2025-09-07T07:46:33.5314038Z * [new branch] gh/XilunWu/161/base -> origin/gh/XilunWu/161/base 2025-09-07T07:46:33.5314601Z * [new branch] gh/XilunWu/161/head -> origin/gh/XilunWu/161/head 2025-09-07T07:46:33.5315164Z * [new branch] gh/XilunWu/161/orig -> origin/gh/XilunWu/161/orig 2025-09-07T07:46:33.5315715Z * [new branch] gh/XilunWu/163/base -> origin/gh/XilunWu/163/base 2025-09-07T07:46:33.5316280Z * [new branch] gh/XilunWu/163/head -> origin/gh/XilunWu/163/head 2025-09-07T07:46:33.5316840Z * [new branch] gh/XilunWu/163/orig -> origin/gh/XilunWu/163/orig 2025-09-07T07:46:33.5317407Z * [new branch] gh/XilunWu/164/base -> origin/gh/XilunWu/164/base 2025-09-07T07:46:33.5318060Z * [new branch] gh/XilunWu/164/head -> origin/gh/XilunWu/164/head 2025-09-07T07:46:33.5318615Z * [new branch] gh/XilunWu/164/orig -> origin/gh/XilunWu/164/orig 2025-09-07T07:46:33.5319179Z * [new branch] gh/XilunWu/165/base -> origin/gh/XilunWu/165/base 2025-09-07T07:46:33.5319752Z * [new branch] gh/XilunWu/165/head -> origin/gh/XilunWu/165/head 2025-09-07T07:46:33.5320320Z * [new branch] gh/XilunWu/165/orig -> 
origin/gh/XilunWu/165/orig 2025-09-07T07:46:33.5320873Z * [new branch] gh/XilunWu/166/base -> origin/gh/XilunWu/166/base 2025-09-07T07:46:33.5321438Z * [new branch] gh/XilunWu/166/head -> origin/gh/XilunWu/166/head 2025-09-07T07:46:33.5322006Z * [new branch] gh/XilunWu/166/orig -> origin/gh/XilunWu/166/orig 2025-09-07T07:46:33.5322575Z * [new branch] gh/XilunWu/167/base -> origin/gh/XilunWu/167/base 2025-09-07T07:46:33.5323330Z * [new branch] gh/XilunWu/167/head -> origin/gh/XilunWu/167/head 2025-09-07T07:46:33.5323886Z * [new branch] gh/XilunWu/167/orig -> origin/gh/XilunWu/167/orig 2025-09-07T07:46:33.5324462Z * [new branch] gh/XilunWu/168/base -> origin/gh/XilunWu/168/base 2025-09-07T07:46:33.5325028Z * [new branch] gh/XilunWu/168/head -> origin/gh/XilunWu/168/head 2025-09-07T07:46:33.5325591Z * [new branch] gh/XilunWu/168/orig -> origin/gh/XilunWu/168/orig 2025-09-07T07:46:33.5326154Z * [new branch] gh/XilunWu/169/base -> origin/gh/XilunWu/169/base 2025-09-07T07:46:33.5326705Z * [new branch] gh/XilunWu/169/head -> origin/gh/XilunWu/169/head 2025-09-07T07:46:33.5327272Z * [new branch] gh/XilunWu/169/orig -> origin/gh/XilunWu/169/orig 2025-09-07T07:46:33.5327982Z * [new branch] gh/XilunWu/170/base -> origin/gh/XilunWu/170/base 2025-09-07T07:46:33.5328554Z * [new branch] gh/XilunWu/170/head -> origin/gh/XilunWu/170/head 2025-09-07T07:46:33.5329124Z * [new branch] gh/XilunWu/170/orig -> origin/gh/XilunWu/170/orig 2025-09-07T07:46:33.5329696Z * [new branch] gh/XuehaiPan/14/base -> origin/gh/XuehaiPan/14/base 2025-09-07T07:46:33.5330286Z * [new branch] gh/XuehaiPan/14/head -> origin/gh/XuehaiPan/14/head 2025-09-07T07:46:33.5330877Z * [new branch] gh/XuehaiPan/14/orig -> origin/gh/XuehaiPan/14/orig 2025-09-07T07:46:33.5331469Z * [new branch] gh/XuehaiPan/179/base -> origin/gh/XuehaiPan/179/base 2025-09-07T07:46:33.5332065Z * [new branch] gh/XuehaiPan/179/head -> origin/gh/XuehaiPan/179/head 2025-09-07T07:46:33.5332648Z * [new branch] gh/XuehaiPan/179/orig -> 
origin/gh/XuehaiPan/179/orig 2025-09-07T07:46:33.5333243Z * [new branch] gh/XuehaiPan/189/base -> origin/gh/XuehaiPan/189/base 2025-09-07T07:46:33.5333844Z * [new branch] gh/XuehaiPan/189/head -> origin/gh/XuehaiPan/189/head 2025-09-07T07:46:33.5334430Z * [new branch] gh/XuehaiPan/189/orig -> origin/gh/XuehaiPan/189/orig 2025-09-07T07:46:33.5335023Z * [new branch] gh/XuehaiPan/232/base -> origin/gh/XuehaiPan/232/base 2025-09-07T07:46:33.5335604Z * [new branch] gh/XuehaiPan/232/head -> origin/gh/XuehaiPan/232/head 2025-09-07T07:46:33.5336196Z * [new branch] gh/XuehaiPan/232/orig -> origin/gh/XuehaiPan/232/orig 2025-09-07T07:46:33.5336785Z * [new branch] gh/XuehaiPan/249/base -> origin/gh/XuehaiPan/249/base 2025-09-07T07:46:33.5337445Z * [new branch] gh/XuehaiPan/249/head -> origin/gh/XuehaiPan/249/head 2025-09-07T07:46:33.5338033Z * [new branch] gh/XuehaiPan/249/orig -> origin/gh/XuehaiPan/249/orig 2025-09-07T07:46:33.5338850Z * [new branch] gh/XuehaiPan/253/base -> origin/gh/XuehaiPan/253/base 2025-09-07T07:46:33.5339451Z * [new branch] gh/XuehaiPan/253/head -> origin/gh/XuehaiPan/253/head 2025-09-07T07:46:33.5340046Z * [new branch] gh/XuehaiPan/253/orig -> origin/gh/XuehaiPan/253/orig 2025-09-07T07:46:33.5340639Z * [new branch] gh/XuehaiPan/254/base -> origin/gh/XuehaiPan/254/base 2025-09-07T07:46:33.5341219Z * [new branch] gh/XuehaiPan/254/head -> origin/gh/XuehaiPan/254/head 2025-09-07T07:46:33.5341816Z * [new branch] gh/XuehaiPan/254/orig -> origin/gh/XuehaiPan/254/orig 2025-09-07T07:46:33.5342414Z * [new branch] gh/XuehaiPan/255/base -> origin/gh/XuehaiPan/255/base 2025-09-07T07:46:33.5343006Z * [new branch] gh/XuehaiPan/255/head -> origin/gh/XuehaiPan/255/head 2025-09-07T07:46:33.5343600Z * [new branch] gh/XuehaiPan/255/orig -> origin/gh/XuehaiPan/255/orig 2025-09-07T07:46:33.5344178Z * [new branch] gh/XuehaiPan/257/base -> origin/gh/XuehaiPan/257/base 2025-09-07T07:46:33.5344764Z * [new branch] gh/XuehaiPan/257/head -> origin/gh/XuehaiPan/257/head 
2025-09-07T07:46:33.5345355Z * [new branch] gh/XuehaiPan/257/orig -> origin/gh/XuehaiPan/257/orig 2025-09-07T07:46:33.5345942Z * [new branch] gh/XuehaiPan/271/base -> origin/gh/XuehaiPan/271/base 2025-09-07T07:46:33.5346533Z * [new branch] gh/XuehaiPan/271/head -> origin/gh/XuehaiPan/271/head 2025-09-07T07:46:33.5347112Z * [new branch] gh/XuehaiPan/271/orig -> origin/gh/XuehaiPan/271/orig 2025-09-07T07:46:33.5347700Z * [new branch] gh/XuehaiPan/290/base -> origin/gh/XuehaiPan/290/base 2025-09-07T07:46:33.5348411Z * [new branch] gh/XuehaiPan/290/head -> origin/gh/XuehaiPan/290/head 2025-09-07T07:46:33.5349005Z * [new branch] gh/XuehaiPan/290/orig -> origin/gh/XuehaiPan/290/orig 2025-09-07T07:46:33.5349597Z * [new branch] gh/XuehaiPan/343/base -> origin/gh/XuehaiPan/343/base 2025-09-07T07:46:33.5350178Z * [new branch] gh/XuehaiPan/343/head -> origin/gh/XuehaiPan/343/head 2025-09-07T07:46:33.5350772Z * [new branch] gh/XuehaiPan/343/orig -> origin/gh/XuehaiPan/343/orig 2025-09-07T07:46:33.5351364Z * [new branch] gh/XuehaiPan/347/base -> origin/gh/XuehaiPan/347/base 2025-09-07T07:46:33.5351958Z * [new branch] gh/XuehaiPan/347/head -> origin/gh/XuehaiPan/347/head 2025-09-07T07:46:33.5352550Z * [new branch] gh/XuehaiPan/347/orig -> origin/gh/XuehaiPan/347/orig 2025-09-07T07:46:33.5353134Z * [new branch] gh/XuehaiPan/348/base -> origin/gh/XuehaiPan/348/base 2025-09-07T07:46:33.5353731Z * [new branch] gh/XuehaiPan/348/head -> origin/gh/XuehaiPan/348/head 2025-09-07T07:46:33.5354325Z * [new branch] gh/XuehaiPan/348/orig -> origin/gh/XuehaiPan/348/orig 2025-09-07T07:46:33.5354919Z * [new branch] gh/XuehaiPan/350/base -> origin/gh/XuehaiPan/350/base 2025-09-07T07:46:33.5355509Z * [new branch] gh/XuehaiPan/350/head -> origin/gh/XuehaiPan/350/head 2025-09-07T07:46:33.5356089Z * [new branch] gh/XuehaiPan/350/orig -> origin/gh/XuehaiPan/350/orig 2025-09-07T07:46:33.5356683Z * [new branch] gh/XuehaiPan/356/base -> origin/gh/XuehaiPan/356/base 2025-09-07T07:46:33.5357276Z * [new 
branch] gh/XuehaiPan/356/head -> origin/gh/XuehaiPan/356/head 2025-09-07T07:46:33.5357870Z * [new branch] gh/XuehaiPan/356/orig -> origin/gh/XuehaiPan/356/orig 2025-09-07T07:46:33.5358448Z * [new branch] gh/XuehaiPan/357/base -> origin/gh/XuehaiPan/357/base 2025-09-07T07:46:33.5359051Z * [new branch] gh/XuehaiPan/357/head -> origin/gh/XuehaiPan/357/head 2025-09-07T07:46:33.5359737Z * [new branch] gh/XuehaiPan/357/orig -> origin/gh/XuehaiPan/357/orig 2025-09-07T07:46:33.5360338Z * [new branch] gh/XuehaiPan/358/base -> origin/gh/XuehaiPan/358/base 2025-09-07T07:46:33.5360933Z * [new branch] gh/XuehaiPan/358/head -> origin/gh/XuehaiPan/358/head 2025-09-07T07:46:33.5361513Z * [new branch] gh/XuehaiPan/358/orig -> origin/gh/XuehaiPan/358/orig 2025-09-07T07:46:33.5362106Z * [new branch] gh/XuehaiPan/359/base -> origin/gh/XuehaiPan/359/base 2025-09-07T07:46:33.5362695Z * [new branch] gh/XuehaiPan/359/head -> origin/gh/XuehaiPan/359/head 2025-09-07T07:46:33.5363457Z * [new branch] gh/XuehaiPan/359/orig -> origin/gh/XuehaiPan/359/orig 2025-09-07T07:46:33.5364055Z * [new branch] gh/XuehaiPan/360/base -> origin/gh/XuehaiPan/360/base 2025-09-07T07:46:33.5364635Z * [new branch] gh/XuehaiPan/360/head -> origin/gh/XuehaiPan/360/head 2025-09-07T07:46:33.5365228Z * [new branch] gh/XuehaiPan/360/orig -> origin/gh/XuehaiPan/360/orig 2025-09-07T07:46:33.5365819Z * [new branch] gh/XuehaiPan/365/base -> origin/gh/XuehaiPan/365/base 2025-09-07T07:46:33.5366408Z * [new branch] gh/XuehaiPan/365/head -> origin/gh/XuehaiPan/365/head 2025-09-07T07:46:33.5366997Z * [new branch] gh/XuehaiPan/365/orig -> origin/gh/XuehaiPan/365/orig 2025-09-07T07:46:33.5367575Z * [new branch] gh/XuehaiPan/366/base -> origin/gh/XuehaiPan/366/base 2025-09-07T07:46:33.5368168Z * [new branch] gh/XuehaiPan/366/head -> origin/gh/XuehaiPan/366/head 2025-09-07T07:46:33.5368759Z * [new branch] gh/XuehaiPan/369/base -> origin/gh/XuehaiPan/369/base 2025-09-07T07:46:33.5369480Z * [new branch] gh/XuehaiPan/369/head -> 
origin/gh/XuehaiPan/369/head 2025-09-07T07:46:33.5370070Z * [new branch] gh/XuehaiPan/369/orig -> origin/gh/XuehaiPan/369/orig 2025-09-07T07:46:33.5370647Z * [new branch] gh/XuehaiPan/370/base -> origin/gh/XuehaiPan/370/base 2025-09-07T07:46:33.5371237Z * [new branch] gh/XuehaiPan/370/head -> origin/gh/XuehaiPan/370/head 2025-09-07T07:46:33.5371828Z * [new branch] gh/XuehaiPan/370/orig -> origin/gh/XuehaiPan/370/orig 2025-09-07T07:46:33.5372419Z * [new branch] gh/XuehaiPan/380/base -> origin/gh/XuehaiPan/380/base 2025-09-07T07:46:33.5373011Z * [new branch] gh/XuehaiPan/380/head -> origin/gh/XuehaiPan/380/head 2025-09-07T07:46:33.5373592Z * [new branch] gh/XuehaiPan/380/orig -> origin/gh/XuehaiPan/380/orig 2025-09-07T07:46:33.5374179Z * [new branch] gh/XuehaiPan/381/base -> origin/gh/XuehaiPan/381/base 2025-09-07T07:46:33.5374770Z * [new branch] gh/XuehaiPan/381/head -> origin/gh/XuehaiPan/381/head 2025-09-07T07:46:33.5375372Z * [new branch] gh/XuehaiPan/382/base -> origin/gh/XuehaiPan/382/base 2025-09-07T07:46:33.5375968Z * [new branch] gh/XuehaiPan/382/head -> origin/gh/XuehaiPan/382/head 2025-09-07T07:46:33.5376551Z * [new branch] gh/XuehaiPan/382/orig -> origin/gh/XuehaiPan/382/orig 2025-09-07T07:46:33.5377145Z * [new branch] gh/XuehaiPan/383/base -> origin/gh/XuehaiPan/383/base 2025-09-07T07:46:33.5377838Z * [new branch] gh/XuehaiPan/383/head -> origin/gh/XuehaiPan/383/head 2025-09-07T07:46:33.5378432Z * [new branch] gh/XuehaiPan/383/orig -> origin/gh/XuehaiPan/383/orig 2025-09-07T07:46:33.5379010Z * [new branch] gh/XuehaiPan/384/base -> origin/gh/XuehaiPan/384/base 2025-09-07T07:46:33.5379608Z * [new branch] gh/XuehaiPan/384/head -> origin/gh/XuehaiPan/384/head 2025-09-07T07:46:33.5380196Z * [new branch] gh/XuehaiPan/384/orig -> origin/gh/XuehaiPan/384/orig 2025-09-07T07:46:33.5380936Z * [new branch] gh/XuehaiPan/385/base -> origin/gh/XuehaiPan/385/base 2025-09-07T07:46:33.5381537Z * [new branch] gh/XuehaiPan/385/head -> origin/gh/XuehaiPan/385/head 
2025-09-07T07:46:33.5382123Z * [new branch] gh/XuehaiPan/385/orig -> origin/gh/XuehaiPan/385/orig 2025-09-07T07:46:33.5382723Z * [new branch] gh/XuehaiPan/386/base -> origin/gh/XuehaiPan/386/base 2025-09-07T07:46:33.5383315Z * [new branch] gh/XuehaiPan/386/head -> origin/gh/XuehaiPan/386/head 2025-09-07T07:46:33.5383904Z * [new branch] gh/XuehaiPan/386/orig -> origin/gh/XuehaiPan/386/orig 2025-09-07T07:46:33.5384501Z * [new branch] gh/XuehaiPan/387/base -> origin/gh/XuehaiPan/387/base 2025-09-07T07:46:33.5385084Z * [new branch] gh/XuehaiPan/387/head -> origin/gh/XuehaiPan/387/head 2025-09-07T07:46:33.5385681Z * [new branch] gh/XuehaiPan/387/orig -> origin/gh/XuehaiPan/387/orig 2025-09-07T07:46:33.5386272Z * [new branch] gh/ZainRizvi/1/base -> origin/gh/ZainRizvi/1/base 2025-09-07T07:46:33.5386849Z * [new branch] gh/ZainRizvi/1/head -> origin/gh/ZainRizvi/1/head 2025-09-07T07:46:33.5387421Z * [new branch] gh/ZainRizvi/2/base -> origin/gh/ZainRizvi/2/base 2025-09-07T07:46:33.5387980Z * [new branch] gh/ZainRizvi/2/head -> origin/gh/ZainRizvi/2/head 2025-09-07T07:46:33.5388550Z * [new branch] gh/ZainRizvi/3/base -> origin/gh/ZainRizvi/3/base 2025-09-07T07:46:33.5389124Z * [new branch] gh/ZainRizvi/3/head -> origin/gh/ZainRizvi/3/head 2025-09-07T07:46:33.5389700Z * [new branch] gh/ZainRizvi/4/base -> origin/gh/ZainRizvi/4/base 2025-09-07T07:46:33.5390386Z * [new branch] gh/ZainRizvi/4/head -> origin/gh/ZainRizvi/4/head 2025-09-07T07:46:33.5390950Z * [new branch] gh/ZainRizvi/5/base -> origin/gh/ZainRizvi/5/base 2025-09-07T07:46:33.5391526Z * [new branch] gh/ZainRizvi/5/head -> origin/gh/ZainRizvi/5/head 2025-09-07T07:46:33.5392101Z * [new branch] gh/ZainRizvi/6/base -> origin/gh/ZainRizvi/6/base 2025-09-07T07:46:33.5392678Z * [new branch] gh/ZainRizvi/6/head -> origin/gh/ZainRizvi/6/head 2025-09-07T07:46:33.5393251Z * [new branch] gh/ZainRizvi/6/orig -> origin/gh/ZainRizvi/6/orig 2025-09-07T07:46:33.5393812Z * [new branch] gh/ZainRizvi/7/base -> 
origin/gh/ZainRizvi/7/base 2025-09-07T07:46:33.5394378Z * [new branch] gh/ZainRizvi/7/head -> origin/gh/ZainRizvi/7/head 2025-09-07T07:46:33.5394957Z * [new branch] gh/ZainRizvi/7/orig -> origin/gh/ZainRizvi/7/orig 2025-09-07T07:46:33.5395534Z * [new branch] gh/ZainRizvi/8/base -> origin/gh/ZainRizvi/8/base 2025-09-07T07:46:33.5396104Z * [new branch] gh/ZainRizvi/8/head -> origin/gh/ZainRizvi/8/head 2025-09-07T07:46:33.5396661Z * [new branch] gh/ZainRizvi/9/base -> origin/gh/ZainRizvi/9/base 2025-09-07T07:46:33.5397231Z * [new branch] gh/ZainRizvi/9/head -> origin/gh/ZainRizvi/9/head 2025-09-07T07:46:33.5397802Z * [new branch] gh/ZainRizvi/9/orig -> origin/gh/ZainRizvi/9/orig 2025-09-07T07:46:33.5398397Z * [new branch] gh/ZhiweiYan-96/39/base -> origin/gh/ZhiweiYan-96/39/base 2025-09-07T07:46:33.5398993Z * [new branch] gh/ZhiweiYan-96/39/head -> origin/gh/ZhiweiYan-96/39/head 2025-09-07T07:46:33.5399594Z * [new branch] gh/ZhiweiYan-96/39/orig -> origin/gh/ZhiweiYan-96/39/orig 2025-09-07T07:46:33.5400202Z * [new branch] gh/ZhiweiYan-96/44/base -> origin/gh/ZhiweiYan-96/44/base 2025-09-07T07:46:33.5400803Z * [new branch] gh/ZhiweiYan-96/44/head -> origin/gh/ZhiweiYan-96/44/head 2025-09-07T07:46:33.5401523Z * [new branch] gh/ZhiweiYan-96/45/base -> origin/gh/ZhiweiYan-96/45/base 2025-09-07T07:46:33.5402117Z * [new branch] gh/ZhiweiYan-96/45/head -> origin/gh/ZhiweiYan-96/45/head 2025-09-07T07:46:33.5402720Z * [new branch] gh/ZhiweiYan-96/49/base -> origin/gh/ZhiweiYan-96/49/base 2025-09-07T07:46:33.5403485Z * [new branch] gh/ZhiweiYan-96/49/head -> origin/gh/ZhiweiYan-96/49/head 2025-09-07T07:46:33.5404090Z * [new branch] gh/ZhiweiYan-96/62/base -> origin/gh/ZhiweiYan-96/62/base 2025-09-07T07:46:33.5404695Z * [new branch] gh/ZhiweiYan-96/62/head -> origin/gh/ZhiweiYan-96/62/head 2025-09-07T07:46:33.5405279Z * [new branch] gh/ZhiweiYan-96/64/base -> origin/gh/ZhiweiYan-96/64/base 2025-09-07T07:46:33.5405886Z * [new branch] gh/ZhiweiYan-96/64/head -> 
origin/gh/ZhiweiYan-96/64/head 2025-09-07T07:46:33.5406493Z * [new branch] gh/ZhiweiYan-96/64/orig -> origin/gh/ZhiweiYan-96/64/orig 2025-09-07T07:46:33.5407092Z * [new branch] gh/ZhiweiYan-96/65/base -> origin/gh/ZhiweiYan-96/65/base 2025-09-07T07:46:33.5407696Z * [new branch] gh/ZhiweiYan-96/65/head -> origin/gh/ZhiweiYan-96/65/head 2025-09-07T07:46:33.5408285Z * [new branch] gh/ZhiweiYan-96/65/orig -> origin/gh/ZhiweiYan-96/65/orig 2025-09-07T07:46:33.5408886Z * [new branch] gh/ZhiweiYan-96/66/base -> origin/gh/ZhiweiYan-96/66/base 2025-09-07T07:46:33.5409489Z * [new branch] gh/ZhiweiYan-96/66/head -> origin/gh/ZhiweiYan-96/66/head 2025-09-07T07:46:33.5410095Z * [new branch] gh/ZhiweiYan-96/67/base -> origin/gh/ZhiweiYan-96/67/base 2025-09-07T07:46:33.5410824Z * [new branch] gh/ZhiweiYan-96/67/head -> origin/gh/ZhiweiYan-96/67/head 2025-09-07T07:46:33.5411414Z * [new branch] gh/ZhiweiYan-96/68/base -> origin/gh/ZhiweiYan-96/68/base 2025-09-07T07:46:33.5412022Z * [new branch] gh/ZhiweiYan-96/68/head -> origin/gh/ZhiweiYan-96/68/head 2025-09-07T07:46:33.5412625Z * [new branch] gh/ZhiweiYan-96/68/orig -> origin/gh/ZhiweiYan-96/68/orig 2025-09-07T07:46:33.5413216Z * [new branch] gh/aakhundov/1/base -> origin/gh/aakhundov/1/base 2025-09-07T07:46:33.5413788Z * [new branch] gh/aakhundov/1/head -> origin/gh/aakhundov/1/head 2025-09-07T07:46:33.5414346Z * [new branch] gh/aakhundov/2/base -> origin/gh/aakhundov/2/base 2025-09-07T07:46:33.5414917Z * [new branch] gh/aakhundov/2/head -> origin/gh/aakhundov/2/head 2025-09-07T07:46:33.5415503Z * [new branch] gh/aditew01/openblas -> origin/gh/aditew01/openblas 2025-09-07T07:46:33.5416086Z * [new branch] gh/aditew01/sbgemm -> origin/gh/aditew01/sbgemm 2025-09-07T07:46:33.5416656Z * [new branch] gh/aditew01/vecbf16 -> origin/gh/aditew01/vecbf16 2025-09-07T07:46:33.5417540Z * [new branch] gh/alexbrauckmann/paddedtensor_faketensor_init -> origin/gh/alexbrauckmann/paddedtensor_faketensor_init 2025-09-07T07:46:33.5418382Z * [new 
branch] gh/alexsamardzic/9/base -> origin/gh/alexsamardzic/9/base 2025-09-07T07:46:33.5419066Z * [new branch] gh/alexsamardzic/9/head -> origin/gh/alexsamardzic/9/head 2025-09-07T07:46:33.5419689Z * [new branch] gh/alexsamardzic/9/orig -> origin/gh/alexsamardzic/9/orig 2025-09-07T07:46:33.5420279Z * [new branch] gh/amjames/18/base -> origin/gh/amjames/18/base 2025-09-07T07:46:33.5420827Z * [new branch] gh/amjames/18/head -> origin/gh/amjames/18/head 2025-09-07T07:46:33.5421384Z * [new branch] gh/amjames/18/orig -> origin/gh/amjames/18/orig 2025-09-07T07:46:33.5421966Z * [new branch] gh/andrewor14/35/base -> origin/gh/andrewor14/35/base 2025-09-07T07:46:33.5422721Z * [new branch] gh/andrewor14/35/head -> origin/gh/andrewor14/35/head 2025-09-07T07:46:33.5423313Z * [new branch] gh/andrewor14/35/orig -> origin/gh/andrewor14/35/orig 2025-09-07T07:46:33.5423889Z * [new branch] gh/andrewor14/50/base -> origin/gh/andrewor14/50/base 2025-09-07T07:46:33.5424480Z * [new branch] gh/andrewor14/50/head -> origin/gh/andrewor14/50/head 2025-09-07T07:46:33.5425068Z * [new branch] gh/andrewor14/50/orig -> origin/gh/andrewor14/50/orig 2025-09-07T07:46:33.5425653Z * [new branch] gh/andrewor14/51/base -> origin/gh/andrewor14/51/base 2025-09-07T07:46:33.5426262Z * [new branch] gh/andrewor14/51/orig -> origin/gh/andrewor14/51/orig 2025-09-07T07:46:33.5426857Z * [new branch] gh/andyanwang/1/base -> origin/gh/andyanwang/1/base 2025-09-07T07:46:33.5427444Z * [new branch] gh/andyanwang/1/head -> origin/gh/andyanwang/1/head 2025-09-07T07:46:33.5428036Z * [new branch] gh/andyanwang/1/orig -> origin/gh/andyanwang/1/orig 2025-09-07T07:46:33.5428634Z * [new branch] gh/andyanwang/13/base -> origin/gh/andyanwang/13/base 2025-09-07T07:46:33.5429213Z * [new branch] gh/andyanwang/13/head -> origin/gh/andyanwang/13/head 2025-09-07T07:46:33.5429807Z * [new branch] gh/andyanwang/13/orig -> origin/gh/andyanwang/13/orig 2025-09-07T07:46:33.5430397Z * [new branch] gh/andyanwang/2/base -> 
origin/gh/andyanwang/2/base 2025-09-07T07:46:33.5430979Z * [new branch] gh/andyanwang/2/head -> origin/gh/andyanwang/2/head 2025-09-07T07:46:33.5471390Z * [new branch] gh/andyanwang/2/orig -> origin/gh/andyanwang/2/orig 2025-09-07T07:46:33.5472592Z * [new branch] gh/andyanwang/28/base -> origin/gh/andyanwang/28/base 2025-09-07T07:46:33.5473221Z * [new branch] gh/andyanwang/28/head -> origin/gh/andyanwang/28/head 2025-09-07T07:46:33.5473816Z * [new branch] gh/andyanwang/28/orig -> origin/gh/andyanwang/28/orig 2025-09-07T07:46:33.5474411Z * [new branch] gh/andyanwang/3/base -> origin/gh/andyanwang/3/base 2025-09-07T07:46:33.5474984Z * [new branch] gh/andyanwang/3/head -> origin/gh/andyanwang/3/head 2025-09-07T07:46:33.5475573Z * [new branch] gh/andyanwang/3/orig -> origin/gh/andyanwang/3/orig 2025-09-07T07:46:33.5476161Z * [new branch] gh/andyanwang/30/base -> origin/gh/andyanwang/30/base 2025-09-07T07:46:33.5476747Z * [new branch] gh/andyanwang/30/orig -> origin/gh/andyanwang/30/orig 2025-09-07T07:46:33.5477330Z * [new branch] gh/andyanwang/31/base -> origin/gh/andyanwang/31/base 2025-09-07T07:46:33.5477911Z * [new branch] gh/andyanwang/31/orig -> origin/gh/andyanwang/31/orig 2025-09-07T07:46:33.5478496Z * [new branch] gh/andyanwang/32/base -> origin/gh/andyanwang/32/base 2025-09-07T07:46:33.5479125Z * [new branch] gh/andyanwang/32/head -> origin/gh/andyanwang/32/head 2025-09-07T07:46:33.5479716Z * [new branch] gh/andyanwang/32/orig -> origin/gh/andyanwang/32/orig 2025-09-07T07:46:33.5480304Z * [new branch] gh/andyanwang/39/base -> origin/gh/andyanwang/39/base 2025-09-07T07:46:33.5480878Z * [new branch] gh/andyanwang/39/head -> origin/gh/andyanwang/39/head 2025-09-07T07:46:33.5481469Z * [new branch] gh/andyanwang/39/orig -> origin/gh/andyanwang/39/orig 2025-09-07T07:46:33.5482058Z * [new branch] gh/andyanwang/4/base -> origin/gh/andyanwang/4/base 2025-09-07T07:46:33.5482650Z * [new branch] gh/andyanwang/4/head -> origin/gh/andyanwang/4/head 
2025-09-07T07:46:33.5483445Z * [new branch] gh/andyanwang/4/orig -> origin/gh/andyanwang/4/orig 2025-09-07T07:46:33.5484167Z * [new branch] gh/angelayi/107/base -> origin/gh/angelayi/107/base 2025-09-07T07:46:33.5484751Z * [new branch] gh/angelayi/107/head -> origin/gh/angelayi/107/head 2025-09-07T07:46:33.5485325Z * [new branch] gh/angelayi/111/base -> origin/gh/angelayi/111/base 2025-09-07T07:46:33.5485897Z * [new branch] gh/angelayi/111/head -> origin/gh/angelayi/111/head 2025-09-07T07:46:33.5486473Z * [new branch] gh/angelayi/111/orig -> origin/gh/angelayi/111/orig 2025-09-07T07:46:33.5487035Z * [new branch] gh/angelayi/112/base -> origin/gh/angelayi/112/base 2025-09-07T07:46:33.5487615Z * [new branch] gh/angelayi/112/head -> origin/gh/angelayi/112/head 2025-09-07T07:46:33.5488196Z * [new branch] gh/angelayi/112/orig -> origin/gh/angelayi/112/orig 2025-09-07T07:46:33.5488775Z * [new branch] gh/angelayi/113/base -> origin/gh/angelayi/113/base 2025-09-07T07:46:33.5489342Z * [new branch] gh/angelayi/113/head -> origin/gh/angelayi/113/head 2025-09-07T07:46:33.5489905Z * [new branch] gh/angelayi/113/orig -> origin/gh/angelayi/113/orig 2025-09-07T07:46:33.5490477Z * [new branch] gh/angelayi/114/base -> origin/gh/angelayi/114/base 2025-09-07T07:46:33.5491055Z * [new branch] gh/angelayi/114/head -> origin/gh/angelayi/114/head 2025-09-07T07:46:33.5491634Z * [new branch] gh/angelayi/114/orig -> origin/gh/angelayi/114/orig 2025-09-07T07:46:33.5492191Z * [new branch] gh/angelayi/115/base -> origin/gh/angelayi/115/base 2025-09-07T07:46:33.5492767Z * [new branch] gh/angelayi/115/head -> origin/gh/angelayi/115/head 2025-09-07T07:46:33.5493490Z * [new branch] gh/angelayi/115/orig -> origin/gh/angelayi/115/orig 2025-09-07T07:46:33.5494089Z * [new branch] gh/anijain2305/753/base -> origin/gh/anijain2305/753/base 2025-09-07T07:46:33.5494689Z * [new branch] gh/anijain2305/753/head -> origin/gh/anijain2305/753/head 2025-09-07T07:46:33.5495269Z * [new branch] gh/anijain2305/753/orig 
-> origin/gh/anijain2305/753/orig 2025-09-07T07:46:33.5495864Z * [new branch] gh/anijain2305/766/base -> origin/gh/anijain2305/766/base 2025-09-07T07:46:33.5496460Z * [new branch] gh/anijain2305/766/head -> origin/gh/anijain2305/766/head 2025-09-07T07:46:33.5497059Z * [new branch] gh/anijain2305/766/orig -> origin/gh/anijain2305/766/orig 2025-09-07T07:46:33.5497758Z * [new branch] gh/anijain2305/790/base -> origin/gh/anijain2305/790/base 2025-09-07T07:46:33.5498350Z * [new branch] gh/anijain2305/790/head -> origin/gh/anijain2305/790/head 2025-09-07T07:46:33.5498939Z * [new branch] gh/anijain2305/790/orig -> origin/gh/anijain2305/790/orig 2025-09-07T07:46:33.5499541Z * [new branch] gh/anijain2305/792/base -> origin/gh/anijain2305/792/base 2025-09-07T07:46:33.5500143Z * [new branch] gh/anijain2305/792/head -> origin/gh/anijain2305/792/head 2025-09-07T07:46:33.5500740Z * [new branch] gh/anijain2305/792/orig -> origin/gh/anijain2305/792/orig 2025-09-07T07:46:33.5501327Z * [new branch] gh/anijain2305/803/base -> origin/gh/anijain2305/803/base 2025-09-07T07:46:33.5501920Z * [new branch] gh/anijain2305/803/head -> origin/gh/anijain2305/803/head 2025-09-07T07:46:33.5502518Z * [new branch] gh/anijain2305/803/orig -> origin/gh/anijain2305/803/orig 2025-09-07T07:46:33.5503103Z * [new branch] gh/anijain2305/804/base -> origin/gh/anijain2305/804/base 2025-09-07T07:46:33.5503709Z * [new branch] gh/anijain2305/804/head -> origin/gh/anijain2305/804/head 2025-09-07T07:46:33.5504409Z * [new branch] gh/anijain2305/804/orig -> origin/gh/anijain2305/804/orig 2025-09-07T07:46:33.5505010Z * [new branch] gh/anijain2305/805/base -> origin/gh/anijain2305/805/base 2025-09-07T07:46:33.5505610Z * [new branch] gh/anijain2305/805/head -> origin/gh/anijain2305/805/head 2025-09-07T07:46:33.5506210Z * [new branch] gh/anijain2305/805/orig -> origin/gh/anijain2305/805/orig 2025-09-07T07:46:33.5506801Z * [new branch] gh/anijain2305/810/base -> origin/gh/anijain2305/810/base 
2025-09-07T07:46:33.5507377Z * [new branch] gh/anijain2305/810/head -> origin/gh/anijain2305/810/head 2025-09-07T07:46:33.5507977Z * [new branch] gh/anijain2305/810/orig -> origin/gh/anijain2305/810/orig 2025-09-07T07:46:33.5508576Z * [new branch] gh/anijain2305/812/base -> origin/gh/anijain2305/812/base 2025-09-07T07:46:33.5509175Z * [new branch] gh/anijain2305/812/head -> origin/gh/anijain2305/812/head 2025-09-07T07:46:33.5509773Z * [new branch] gh/anijain2305/812/orig -> origin/gh/anijain2305/812/orig 2025-09-07T07:46:33.5510362Z * [new branch] gh/anijain2305/838/base -> origin/gh/anijain2305/838/base 2025-09-07T07:46:33.5510970Z * [new branch] gh/anijain2305/838/head -> origin/gh/anijain2305/838/head 2025-09-07T07:46:33.5511636Z * [new branch] gh/anijain2305/838/orig -> origin/gh/anijain2305/838/orig 2025-09-07T07:46:33.5512234Z * [new branch] gh/anijain2305/839/base -> origin/gh/anijain2305/839/base 2025-09-07T07:46:33.5512831Z * [new branch] gh/anijain2305/839/head -> origin/gh/anijain2305/839/head 2025-09-07T07:46:33.5513419Z * [new branch] gh/anijain2305/839/orig -> origin/gh/anijain2305/839/orig 2025-09-07T07:46:33.5514131Z * [new branch] gh/anijain2305/843/base -> origin/gh/anijain2305/843/base 2025-09-07T07:46:33.5514734Z * [new branch] gh/anijain2305/843/head -> origin/gh/anijain2305/843/head 2025-09-07T07:46:33.5515338Z * [new branch] gh/anijain2305/843/orig -> origin/gh/anijain2305/843/orig 2025-09-07T07:46:33.5515945Z * [new branch] gh/anijain2305/844/base -> origin/gh/anijain2305/844/base 2025-09-07T07:46:33.5516531Z * [new branch] gh/anijain2305/844/head -> origin/gh/anijain2305/844/head 2025-09-07T07:46:33.5517129Z * [new branch] gh/anijain2305/844/orig -> origin/gh/anijain2305/844/orig 2025-09-07T07:46:33.5517722Z * [new branch] gh/anijain2305/846/base -> origin/gh/anijain2305/846/base 2025-09-07T07:46:33.5518325Z * [new branch] gh/anijain2305/846/head -> origin/gh/anijain2305/846/head 2025-09-07T07:46:33.5518914Z * [new branch] 
gh/anijain2305/846/orig -> origin/gh/anijain2305/846/orig 2025-09-07T07:46:33.5519517Z * [new branch] gh/anijain2305/848/base -> origin/gh/anijain2305/848/base 2025-09-07T07:46:33.5520118Z * [new branch] gh/anijain2305/848/head -> origin/gh/anijain2305/848/head 2025-09-07T07:46:33.5520718Z * [new branch] gh/anijain2305/848/orig -> origin/gh/anijain2305/848/orig 2025-09-07T07:46:33.5521310Z * [new branch] gh/anijain2305/849/base -> origin/gh/anijain2305/849/base 2025-09-07T07:46:33.5521893Z * [new branch] gh/anijain2305/849/head -> origin/gh/anijain2305/849/head 2025-09-07T07:46:33.5522489Z * [new branch] gh/anijain2305/849/orig -> origin/gh/anijain2305/849/orig 2025-09-07T07:46:33.5523213Z * [new branch] gh/anijain2305/850/base -> origin/gh/anijain2305/850/base 2025-09-07T07:46:33.5523811Z * [new branch] gh/anijain2305/850/head -> origin/gh/anijain2305/850/head 2025-09-07T07:46:33.5524417Z * [new branch] gh/anijain2305/850/orig -> origin/gh/anijain2305/850/orig 2025-09-07T07:46:33.5525117Z * [new branch] gh/anijain2305/851/base -> origin/gh/anijain2305/851/base 2025-09-07T07:46:33.5525723Z * [new branch] gh/anijain2305/851/head -> origin/gh/anijain2305/851/head 2025-09-07T07:46:33.5526314Z * [new branch] gh/anijain2305/851/orig -> origin/gh/anijain2305/851/orig 2025-09-07T07:46:33.5526909Z * [new branch] gh/anijain2305/852/base -> origin/gh/anijain2305/852/base 2025-09-07T07:46:33.5527508Z * [new branch] gh/anijain2305/852/head -> origin/gh/anijain2305/852/head 2025-09-07T07:46:33.5528092Z * [new branch] gh/anijain2305/852/orig -> origin/gh/anijain2305/852/orig 2025-09-07T07:46:33.5528688Z * [new branch] gh/anijain2305/853/base -> origin/gh/anijain2305/853/base 2025-09-07T07:46:33.5529287Z * [new branch] gh/anijain2305/853/head -> origin/gh/anijain2305/853/head 2025-09-07T07:46:33.5529887Z * [new branch] gh/anijain2305/853/orig -> origin/gh/anijain2305/853/orig 2025-09-07T07:46:33.5530480Z * [new branch] gh/anijain2305/854/base -> origin/gh/anijain2305/854/base 
2025-09-07T07:46:33.5531067Z * [new branch] gh/anijain2305/854/head -> origin/gh/anijain2305/854/head 2025-09-07T07:46:33.5531664Z * [new branch] gh/anijain2305/854/orig -> origin/gh/anijain2305/854/orig 2025-09-07T07:46:33.5532260Z * [new branch] gh/anijain2305/855/base -> origin/gh/anijain2305/855/base 2025-09-07T07:46:33.5532854Z * [new branch] gh/anijain2305/855/head -> origin/gh/anijain2305/855/head 2025-09-07T07:46:33.5533456Z * [new branch] gh/anijain2305/855/orig -> origin/gh/anijain2305/855/orig 2025-09-07T07:46:33.5534045Z * [new branch] gh/anijain2305/856/base -> origin/gh/anijain2305/856/base 2025-09-07T07:46:33.5534807Z * [new branch] gh/anijain2305/856/head -> origin/gh/anijain2305/856/head 2025-09-07T07:46:33.5535410Z * [new branch] gh/anijain2305/856/orig -> origin/gh/anijain2305/856/orig 2025-09-07T07:46:33.5536014Z * [new branch] gh/anijain2305/857/base -> origin/gh/anijain2305/857/base 2025-09-07T07:46:33.5536614Z * [new branch] gh/anijain2305/857/head -> origin/gh/anijain2305/857/head 2025-09-07T07:46:33.5537202Z * [new branch] gh/anijain2305/857/orig -> origin/gh/anijain2305/857/orig 2025-09-07T07:46:33.5537890Z * [new branch] gh/anijain2305/858/base -> origin/gh/anijain2305/858/base 2025-09-07T07:46:33.5538485Z * [new branch] gh/anijain2305/858/head -> origin/gh/anijain2305/858/head 2025-09-07T07:46:33.5539083Z * [new branch] gh/anijain2305/858/orig -> origin/gh/anijain2305/858/orig 2025-09-07T07:46:33.5539678Z * [new branch] gh/anijain2305/859/base -> origin/gh/anijain2305/859/base 2025-09-07T07:46:33.5540271Z * [new branch] gh/anijain2305/859/head -> origin/gh/anijain2305/859/head 2025-09-07T07:46:33.5540870Z * [new branch] gh/anijain2305/859/orig -> origin/gh/anijain2305/859/orig 2025-09-07T07:46:33.5541467Z * [new branch] gh/anijain2305/860/base -> origin/gh/anijain2305/860/base 2025-09-07T07:46:33.5542067Z * [new branch] gh/anijain2305/860/head -> origin/gh/anijain2305/860/head 2025-09-07T07:46:33.5542660Z * [new branch] 
gh/anijain2305/860/orig -> origin/gh/anijain2305/860/orig 2025-09-07T07:46:33.5543250Z * [new branch] gh/anijain2305/861/base -> origin/gh/anijain2305/861/base 2025-09-07T07:46:33.5543851Z * [new branch] gh/anijain2305/861/head -> origin/gh/anijain2305/861/head 2025-09-07T07:46:33.5544454Z * [new branch] gh/anijain2305/861/orig -> origin/gh/anijain2305/861/orig 2025-09-07T07:46:33.5545059Z * [new branch] gh/anijain2305/862/base -> origin/gh/anijain2305/862/base 2025-09-07T07:46:33.5545658Z * [new branch] gh/anijain2305/862/head -> origin/gh/anijain2305/862/head 2025-09-07T07:46:33.5546345Z * [new branch] gh/anijain2305/862/orig -> origin/gh/anijain2305/862/orig 2025-09-07T07:46:33.5546951Z * [new branch] gh/anijain2305/863/base -> origin/gh/anijain2305/863/base 2025-09-07T07:46:33.5547547Z * [new branch] gh/anijain2305/863/head -> origin/gh/anijain2305/863/head 2025-09-07T07:46:33.5548150Z * [new branch] gh/anijain2305/863/orig -> origin/gh/anijain2305/863/orig 2025-09-07T07:46:33.5548755Z * [new branch] gh/anijain2305/864/base -> origin/gh/anijain2305/864/base 2025-09-07T07:46:33.5549341Z * [new branch] gh/anijain2305/864/head -> origin/gh/anijain2305/864/head 2025-09-07T07:46:33.5549939Z * [new branch] gh/anijain2305/864/orig -> origin/gh/anijain2305/864/orig 2025-09-07T07:46:33.5550542Z * [new branch] gh/anijain2305/865/base -> origin/gh/anijain2305/865/base 2025-09-07T07:46:33.5551135Z * [new branch] gh/anijain2305/865/head -> origin/gh/anijain2305/865/head 2025-09-07T07:46:33.5551735Z * [new branch] gh/anijain2305/865/orig -> origin/gh/anijain2305/865/orig 2025-09-07T07:46:33.5552323Z * [new branch] gh/anijain2305/866/base -> origin/gh/anijain2305/866/base 2025-09-07T07:46:33.5552923Z * [new branch] gh/anijain2305/866/head -> origin/gh/anijain2305/866/head 2025-09-07T07:46:33.5553516Z * [new branch] gh/anijain2305/866/orig -> origin/gh/anijain2305/866/orig 2025-09-07T07:46:33.5554109Z * [new branch] gh/anjali411/216/base -> origin/gh/anjali411/216/base 
2025-09-07T07:46:33.5554827Z * [new branch] gh/anjali411/216/head -> origin/gh/anjali411/216/head 2025-09-07T07:46:33.5555497Z * [new branch] gh/anjali411/216/orig -> origin/gh/anjali411/216/orig 2025-09-07T07:46:33.5556101Z * [new branch] gh/ankitageorge/13/base -> origin/gh/ankitageorge/13/base 2025-09-07T07:46:33.5556721Z * [new branch] gh/ankitageorge/13/head -> origin/gh/ankitageorge/13/head 2025-09-07T07:46:33.5557339Z * [new branch] gh/ankitageorge/13/orig -> origin/gh/ankitageorge/13/orig 2025-09-07T07:46:33.5557946Z * [new branch] gh/ankitageorge/14/base -> origin/gh/ankitageorge/14/base 2025-09-07T07:46:33.5558540Z * [new branch] gh/ankitageorge/14/head -> origin/gh/ankitageorge/14/head 2025-09-07T07:46:33.5559151Z * [new branch] gh/ankitageorge/14/orig -> origin/gh/ankitageorge/14/orig 2025-09-07T07:46:33.5559760Z * [new branch] gh/ankitageorge/15/base -> origin/gh/ankitageorge/15/base 2025-09-07T07:46:33.5560369Z * [new branch] gh/ankitageorge/15/head -> origin/gh/ankitageorge/15/head 2025-09-07T07:46:33.5560972Z * [new branch] gh/ankitageorge/15/orig -> origin/gh/ankitageorge/15/orig 2025-09-07T07:46:33.5561594Z * [new branch] gh/ankitageorge/16/base -> origin/gh/ankitageorge/16/base 2025-09-07T07:46:33.5562206Z * [new branch] gh/ankitageorge/16/head -> origin/gh/ankitageorge/16/head 2025-09-07T07:46:33.5562810Z * [new branch] gh/ankitageorge/16/orig -> origin/gh/ankitageorge/16/orig 2025-09-07T07:46:33.5563552Z * [new branch] gh/ankitageorge/17/base -> origin/gh/ankitageorge/17/base 2025-09-07T07:46:33.5564157Z * [new branch] gh/ankitageorge/17/head -> origin/gh/ankitageorge/17/head 2025-09-07T07:46:33.5564770Z * [new branch] gh/ankitageorge/17/orig -> origin/gh/ankitageorge/17/orig 2025-09-07T07:46:33.5565382Z * [new branch] gh/ankitageorge/21/base -> origin/gh/ankitageorge/21/base 2025-09-07T07:46:33.5566002Z * [new branch] gh/ankitageorge/21/head -> origin/gh/ankitageorge/21/head 2025-09-07T07:46:33.5566606Z * [new branch] gh/ankitageorge/21/orig 
-> origin/gh/ankitageorge/21/orig 2025-09-07T07:46:33.5567315Z * [new branch] gh/anshul-si/1/base -> origin/gh/anshul-si/1/base 2025-09-07T07:46:33.5567893Z * [new branch] gh/anshul-si/1/head -> origin/gh/anshul-si/1/head 2025-09-07T07:46:33.5568469Z * [new branch] gh/anshul-si/15/base -> origin/gh/anshul-si/15/base 2025-09-07T07:46:33.5569049Z * [new branch] gh/anshul-si/15/head -> origin/gh/anshul-si/15/head 2025-09-07T07:46:33.5569624Z * [new branch] gh/anshul-si/15/orig -> origin/gh/anshul-si/15/orig 2025-09-07T07:46:33.5570183Z * [new branch] gh/anshul-si/16/base -> origin/gh/anshul-si/16/base 2025-09-07T07:46:33.5570752Z * [new branch] gh/anshul-si/16/head -> origin/gh/anshul-si/16/head 2025-09-07T07:46:33.5571327Z * [new branch] gh/anshul-si/16/orig -> origin/gh/anshul-si/16/orig 2025-09-07T07:46:33.5571900Z * [new branch] gh/anshul-si/17/base -> origin/gh/anshul-si/17/base 2025-09-07T07:46:33.5572472Z * [new branch] gh/anshul-si/17/head -> origin/gh/anshul-si/17/head 2025-09-07T07:46:33.5573032Z * [new branch] gh/anshul-si/17/orig -> origin/gh/anshul-si/17/orig 2025-09-07T07:46:33.5573595Z * [new branch] gh/anshul-si/18/base -> origin/gh/anshul-si/18/base 2025-09-07T07:46:33.5574151Z * [new branch] gh/anshul-si/18/head -> origin/gh/anshul-si/18/head 2025-09-07T07:46:33.5574717Z * [new branch] gh/anshul-si/18/orig -> origin/gh/anshul-si/18/orig 2025-09-07T07:46:33.5575289Z * [new branch] gh/anshul-si/19/base -> origin/gh/anshul-si/19/base 2025-09-07T07:46:33.5575842Z * [new branch] gh/anshul-si/19/head -> origin/gh/anshul-si/19/head 2025-09-07T07:46:33.5576558Z * [new branch] gh/anshul-si/19/orig -> origin/gh/anshul-si/19/orig 2025-09-07T07:46:33.5577138Z * [new branch] gh/anshul-si/2/base -> origin/gh/anshul-si/2/base 2025-09-07T07:46:33.5577788Z * [new branch] gh/anshul-si/2/head -> origin/gh/anshul-si/2/head 2025-09-07T07:46:33.5578360Z * [new branch] gh/anshul-si/20/base -> origin/gh/anshul-si/20/base 2025-09-07T07:46:33.5578911Z * [new branch] 
gh/anshul-si/20/head -> origin/gh/anshul-si/20/head 2025-09-07T07:46:33.5579481Z * [new branch] gh/anshul-si/20/orig -> origin/gh/anshul-si/20/orig 2025-09-07T07:46:33.5580052Z * [new branch] gh/anshul-si/21/base -> origin/gh/anshul-si/21/base 2025-09-07T07:46:33.5580623Z * [new branch] gh/anshul-si/21/head -> origin/gh/anshul-si/21/head 2025-09-07T07:46:33.5581197Z * [new branch] gh/anshul-si/21/orig -> origin/gh/anshul-si/21/orig 2025-09-07T07:46:33.5581758Z * [new branch] gh/anshul-si/22/base -> origin/gh/anshul-si/22/base 2025-09-07T07:46:33.5582317Z * [new branch] gh/anshul-si/22/head -> origin/gh/anshul-si/22/head 2025-09-07T07:46:33.5582877Z * [new branch] gh/anshul-si/22/orig -> origin/gh/anshul-si/22/orig 2025-09-07T07:46:33.5583446Z * [new branch] gh/anshul-si/23/base -> origin/gh/anshul-si/23/base 2025-09-07T07:46:33.5584002Z * [new branch] gh/anshul-si/23/head -> origin/gh/anshul-si/23/head 2025-09-07T07:46:33.5584574Z * [new branch] gh/anshul-si/23/orig -> origin/gh/anshul-si/23/orig 2025-09-07T07:46:33.5585142Z * [new branch] gh/anshul-si/24/base -> origin/gh/anshul-si/24/base 2025-09-07T07:46:33.5585711Z * [new branch] gh/anshul-si/24/head -> origin/gh/anshul-si/24/head 2025-09-07T07:46:33.5586283Z * [new branch] gh/anshul-si/24/orig -> origin/gh/anshul-si/24/orig 2025-09-07T07:46:33.5586843Z * [new branch] gh/anshul-si/25/base -> origin/gh/anshul-si/25/base 2025-09-07T07:46:33.5587528Z * [new branch] gh/anshul-si/25/head -> origin/gh/anshul-si/25/head 2025-09-07T07:46:33.5588107Z * [new branch] gh/anshul-si/25/orig -> origin/gh/anshul-si/25/orig 2025-09-07T07:46:33.5588670Z * [new branch] gh/anshul-si/26/base -> origin/gh/anshul-si/26/base 2025-09-07T07:46:33.5589244Z * [new branch] gh/anshul-si/26/head -> origin/gh/anshul-si/26/head 2025-09-07T07:46:33.5589806Z * [new branch] gh/anshul-si/26/orig -> origin/gh/anshul-si/26/orig 2025-09-07T07:46:33.5590377Z * [new branch] gh/anshul-si/27/base -> origin/gh/anshul-si/27/base 
2025-09-07T07:46:33.5590943Z * [new branch] gh/anshul-si/27/head -> origin/gh/anshul-si/27/head 2025-09-07T07:46:33.5591513Z * [new branch] gh/anshul-si/27/orig -> origin/gh/anshul-si/27/orig 2025-09-07T07:46:33.5592080Z * [new branch] gh/anshul-si/28/base -> origin/gh/anshul-si/28/base 2025-09-07T07:46:33.5592303Z * [new branch] gh/anshul-si/28/head -> origin/gh/anshul-si/28/head 2025-09-07T07:46:33.5592525Z * [new branch] gh/anshul-si/28/orig -> origin/gh/anshul-si/28/orig 2025-09-07T07:46:33.5592755Z * [new branch] gh/anshul-si/29/base -> origin/gh/anshul-si/29/base 2025-09-07T07:46:33.5592978Z * [new branch] gh/anshul-si/29/head -> origin/gh/anshul-si/29/head 2025-09-07T07:46:33.5593211Z * [new branch] gh/anshul-si/29/orig -> origin/gh/anshul-si/29/orig 2025-09-07T07:46:33.5593431Z * [new branch] gh/anshul-si/3/base -> origin/gh/anshul-si/3/base 2025-09-07T07:46:33.5593659Z * [new branch] gh/anshul-si/3/head -> origin/gh/anshul-si/3/head 2025-09-07T07:46:33.5593972Z * [new branch] gh/anshul-si/4/base -> origin/gh/anshul-si/4/base 2025-09-07T07:46:33.5594192Z * [new branch] gh/anshul-si/4/head -> origin/gh/anshul-si/4/head 2025-09-07T07:46:33.5594423Z * [new branch] gh/anshul-si/5/base -> origin/gh/anshul-si/5/base 2025-09-07T07:46:33.5594637Z * [new branch] gh/anshul-si/5/head -> origin/gh/anshul-si/5/head 2025-09-07T07:46:33.5594879Z * [new branch] gh/aorenste/132/base -> origin/gh/aorenste/132/base 2025-09-07T07:46:33.5595102Z * [new branch] gh/aorenste/132/head -> origin/gh/aorenste/132/head 2025-09-07T07:46:33.5595334Z * [new branch] gh/bdhirsh/650/base -> origin/gh/bdhirsh/650/base 2025-09-07T07:46:33.5595551Z * [new branch] gh/bdhirsh/650/head -> origin/gh/bdhirsh/650/head 2025-09-07T07:46:33.5595765Z * [new branch] gh/bdhirsh/650/orig -> origin/gh/bdhirsh/650/orig 2025-09-07T07:46:33.5595997Z * [new branch] gh/bdhirsh/663/base -> origin/gh/bdhirsh/663/base 2025-09-07T07:46:33.5596217Z * [new branch] gh/bdhirsh/663/head -> origin/gh/bdhirsh/663/head 
2025-09-07T07:46:33.5596452Z * [new branch] gh/bdhirsh/663/orig -> origin/gh/bdhirsh/663/orig 2025-09-07T07:46:33.5596670Z * [new branch] gh/bdhirsh/665/base -> origin/gh/bdhirsh/665/base 2025-09-07T07:46:33.5596887Z * [new branch] gh/bdhirsh/665/head -> origin/gh/bdhirsh/665/head 2025-09-07T07:46:33.5597115Z * [new branch] gh/bdhirsh/665/orig -> origin/gh/bdhirsh/665/orig 2025-09-07T07:46:33.5597330Z * [new branch] gh/bdhirsh/666/base -> origin/gh/bdhirsh/666/base 2025-09-07T07:46:33.5597563Z * [new branch] gh/bdhirsh/666/head -> origin/gh/bdhirsh/666/head 2025-09-07T07:46:33.5597782Z * [new branch] gh/bdhirsh/666/orig -> origin/gh/bdhirsh/666/orig 2025-09-07T07:46:33.5598013Z * [new branch] gh/bdhirsh/667/base -> origin/gh/bdhirsh/667/base 2025-09-07T07:46:33.5598417Z * [new branch] gh/bdhirsh/667/head -> origin/gh/bdhirsh/667/head 2025-09-07T07:46:33.5598637Z * [new branch] gh/bdhirsh/667/orig -> origin/gh/bdhirsh/667/orig 2025-09-07T07:46:33.5598869Z * [new branch] gh/bdhirsh/668/base -> origin/gh/bdhirsh/668/base 2025-09-07T07:46:33.5599087Z * [new branch] gh/bdhirsh/668/head -> origin/gh/bdhirsh/668/head 2025-09-07T07:46:33.5599317Z * [new branch] gh/bdhirsh/668/orig -> origin/gh/bdhirsh/668/orig 2025-09-07T07:46:33.5599534Z * [new branch] gh/bdhirsh/669/base -> origin/gh/bdhirsh/669/base 2025-09-07T07:46:33.5599749Z * [new branch] gh/bdhirsh/669/head -> origin/gh/bdhirsh/669/head 2025-09-07T07:46:33.5599984Z * [new branch] gh/bdhirsh/669/orig -> origin/gh/bdhirsh/669/orig 2025-09-07T07:46:33.5600200Z * [new branch] gh/bdhirsh/670/base -> origin/gh/bdhirsh/670/base 2025-09-07T07:46:33.5600430Z * [new branch] gh/bdhirsh/670/head -> origin/gh/bdhirsh/670/head 2025-09-07T07:46:33.5600650Z * [new branch] gh/bdhirsh/670/orig -> origin/gh/bdhirsh/670/orig 2025-09-07T07:46:33.5600931Z * [new branch] gh/benjaminglass1/100/base -> origin/gh/benjaminglass1/100/base 2025-09-07T07:46:33.5601198Z * [new branch] gh/benjaminglass1/100/head -> 
origin/gh/benjaminglass1/100/head 2025-09-07T07:46:33.5601456Z * [new branch] gh/benjaminglass1/100/orig -> origin/gh/benjaminglass1/100/orig 2025-09-07T07:46:33.5601731Z * [new branch] gh/benjaminglass1/101/base -> origin/gh/benjaminglass1/101/base 2025-09-07T07:46:33.5601990Z * [new branch] gh/benjaminglass1/101/head -> origin/gh/benjaminglass1/101/head 2025-09-07T07:46:33.5602352Z * [new branch] gh/benjaminglass1/101/orig -> origin/gh/benjaminglass1/101/orig 2025-09-07T07:46:33.5602614Z * [new branch] gh/benjaminglass1/102/base -> origin/gh/benjaminglass1/102/base 2025-09-07T07:46:33.5603016Z * [new branch] gh/benjaminglass1/102/head -> origin/gh/benjaminglass1/102/head 2025-09-07T07:46:33.5603279Z * [new branch] gh/benjaminglass1/102/orig -> origin/gh/benjaminglass1/102/orig 2025-09-07T07:46:33.5603540Z * [new branch] gh/benjaminglass1/103/base -> origin/gh/benjaminglass1/103/base 2025-09-07T07:46:33.5603814Z * [new branch] gh/benjaminglass1/103/head -> origin/gh/benjaminglass1/103/head 2025-09-07T07:46:33.5604071Z * [new branch] gh/benjaminglass1/103/orig -> origin/gh/benjaminglass1/103/orig 2025-09-07T07:46:33.5604346Z * [new branch] gh/benjaminglass1/104/base -> origin/gh/benjaminglass1/104/base 2025-09-07T07:46:33.5604609Z * [new branch] gh/benjaminglass1/104/head -> origin/gh/benjaminglass1/104/head 2025-09-07T07:46:33.5604884Z * [new branch] gh/benjaminglass1/104/orig -> origin/gh/benjaminglass1/104/orig 2025-09-07T07:46:33.5605147Z * [new branch] gh/benjaminglass1/105/base -> origin/gh/benjaminglass1/105/base 2025-09-07T07:46:33.5605406Z * [new branch] gh/benjaminglass1/105/head -> origin/gh/benjaminglass1/105/head 2025-09-07T07:46:33.5605677Z * [new branch] gh/benjaminglass1/105/orig -> origin/gh/benjaminglass1/105/orig 2025-09-07T07:46:33.5605937Z * [new branch] gh/benjaminglass1/106/base -> origin/gh/benjaminglass1/106/base 2025-09-07T07:46:33.5606205Z * [new branch] gh/benjaminglass1/106/head -> origin/gh/benjaminglass1/106/head 
2025-09-07T07:46:33.5606463Z * [new branch] gh/benjaminglass1/106/orig -> origin/gh/benjaminglass1/106/orig 2025-09-07T07:46:33.5606720Z * [new branch] gh/benjaminglass1/79/base -> origin/gh/benjaminglass1/79/base 2025-09-07T07:46:33.5606987Z * [new branch] gh/benjaminglass1/79/head -> origin/gh/benjaminglass1/79/head 2025-09-07T07:46:33.5607392Z * [new branch] gh/benjaminglass1/79/orig -> origin/gh/benjaminglass1/79/orig 2025-09-07T07:46:33.5607668Z * [new branch] gh/benjaminglass1/86/base -> origin/gh/benjaminglass1/86/base 2025-09-07T07:46:33.5607922Z * [new branch] gh/benjaminglass1/86/head -> origin/gh/benjaminglass1/86/head 2025-09-07T07:46:33.5608189Z * [new branch] gh/benjaminglass1/86/orig -> origin/gh/benjaminglass1/86/orig 2025-09-07T07:46:33.5608443Z * [new branch] gh/benjaminglass1/89/base -> origin/gh/benjaminglass1/89/base 2025-09-07T07:46:33.5608695Z * [new branch] gh/benjaminglass1/89/head -> origin/gh/benjaminglass1/89/head 2025-09-07T07:46:33.5608955Z * [new branch] gh/benjaminglass1/89/orig -> origin/gh/benjaminglass1/89/orig 2025-09-07T07:46:33.5609215Z * [new branch] gh/benjaminglass1/91/base -> origin/gh/benjaminglass1/91/base 2025-09-07T07:46:33.5609484Z * [new branch] gh/benjaminglass1/91/head -> origin/gh/benjaminglass1/91/head 2025-09-07T07:46:33.5609738Z * [new branch] gh/benjaminglass1/91/orig -> origin/gh/benjaminglass1/91/orig 2025-09-07T07:46:33.5610007Z * [new branch] gh/benjaminglass1/93/base -> origin/gh/benjaminglass1/93/base 2025-09-07T07:46:33.5610261Z * [new branch] gh/benjaminglass1/93/head -> origin/gh/benjaminglass1/93/head 2025-09-07T07:46:33.5610514Z * [new branch] gh/benjaminglass1/93/orig -> origin/gh/benjaminglass1/93/orig 2025-09-07T07:46:33.5610783Z * [new branch] gh/benjaminglass1/95/base -> origin/gh/benjaminglass1/95/base 2025-09-07T07:46:33.5611038Z * [new branch] gh/benjaminglass1/95/head -> origin/gh/benjaminglass1/95/head 2025-09-07T07:46:33.5611416Z * [new branch] gh/benjaminglass1/95/orig -> 
origin/gh/benjaminglass1/95/orig 2025-09-07T07:46:33.5611670Z * [new branch] gh/benjaminglass1/97/base -> origin/gh/benjaminglass1/97/base 2025-09-07T07:46:33.5611938Z * [new branch] gh/benjaminglass1/97/head -> origin/gh/benjaminglass1/97/head 2025-09-07T07:46:33.5612192Z * [new branch] gh/benjaminglass1/97/orig -> origin/gh/benjaminglass1/97/orig 2025-09-07T07:46:33.5612444Z * [new branch] gh/benjaminglass1/99/base -> origin/gh/benjaminglass1/99/base 2025-09-07T07:46:33.5612710Z * [new branch] gh/benjaminglass1/99/head -> origin/gh/benjaminglass1/99/head 2025-09-07T07:46:33.5612965Z * [new branch] gh/benjaminglass1/99/orig -> origin/gh/benjaminglass1/99/orig 2025-09-07T07:46:33.5613213Z * [new branch] gh/bobrenjc93/514/base -> origin/gh/bobrenjc93/514/base 2025-09-07T07:46:33.5613447Z * [new branch] gh/bobrenjc93/514/head -> origin/gh/bobrenjc93/514/head 2025-09-07T07:46:33.5613685Z * [new branch] gh/bobrenjc93/514/orig -> origin/gh/bobrenjc93/514/orig 2025-09-07T07:46:33.5613931Z * [new branch] gh/bobrenjc93/521/base -> origin/gh/bobrenjc93/521/base 2025-09-07T07:46:33.5614159Z * [new branch] gh/bobrenjc93/521/head -> origin/gh/bobrenjc93/521/head 2025-09-07T07:46:33.5614401Z * [new branch] gh/bobrenjc93/521/orig -> origin/gh/bobrenjc93/521/orig 2025-09-07T07:46:33.5614633Z * [new branch] gh/bobrenjc93/522/base -> origin/gh/bobrenjc93/522/base 2025-09-07T07:46:33.5614876Z * [new branch] gh/bobrenjc93/522/head -> origin/gh/bobrenjc93/522/head 2025-09-07T07:46:33.5615106Z * [new branch] gh/bobrenjc93/522/orig -> origin/gh/bobrenjc93/522/orig 2025-09-07T07:46:33.5615335Z * [new branch] gh/bobrenjc93/525/base -> origin/gh/bobrenjc93/525/base 2025-09-07T07:46:33.5615586Z * [new branch] gh/bobrenjc93/525/head -> origin/gh/bobrenjc93/525/head 2025-09-07T07:46:33.5615818Z * [new branch] gh/bobrenjc93/525/orig -> origin/gh/bobrenjc93/525/orig 2025-09-07T07:46:33.5616161Z * [new branch] gh/bobrenjc93/526/base -> origin/gh/bobrenjc93/526/base 2025-09-07T07:46:33.5616398Z * 
[new branch] gh/bobrenjc93/526/head -> origin/gh/bobrenjc93/526/head 2025-09-07T07:46:33.5616641Z * [new branch] gh/bobrenjc93/526/orig -> origin/gh/bobrenjc93/526/orig 2025-09-07T07:46:33.5616875Z * [new branch] gh/bobrenjc93/527/base -> origin/gh/bobrenjc93/527/base 2025-09-07T07:46:33.5617105Z * [new branch] gh/bobrenjc93/527/head -> origin/gh/bobrenjc93/527/head 2025-09-07T07:46:33.5617436Z * [new branch] gh/bobrenjc93/527/orig -> origin/gh/bobrenjc93/527/orig 2025-09-07T07:46:33.5617677Z * [new branch] gh/bobrenjc93/528/base -> origin/gh/bobrenjc93/528/base 2025-09-07T07:46:33.5617930Z * [new branch] gh/bobrenjc93/528/head -> origin/gh/bobrenjc93/528/head 2025-09-07T07:46:33.5618167Z * [new branch] gh/bobrenjc93/528/orig -> origin/gh/bobrenjc93/528/orig 2025-09-07T07:46:33.5618399Z * [new branch] gh/bobrenjc93/529/base -> origin/gh/bobrenjc93/529/base 2025-09-07T07:46:33.5618645Z * [new branch] gh/bobrenjc93/529/head -> origin/gh/bobrenjc93/529/head 2025-09-07T07:46:33.5618876Z * [new branch] gh/bobrenjc93/529/orig -> origin/gh/bobrenjc93/529/orig 2025-09-07T07:46:33.5619114Z * [new branch] gh/bobrenjc93/535/base -> origin/gh/bobrenjc93/535/base 2025-09-07T07:46:33.5619348Z * [new branch] gh/bobrenjc93/535/head -> origin/gh/bobrenjc93/535/head 2025-09-07T07:46:33.5619594Z * [new branch] gh/bobrenjc93/535/orig -> origin/gh/bobrenjc93/535/orig 2025-09-07T07:46:33.5619826Z * [new branch] gh/bobrenjc93/537/base -> origin/gh/bobrenjc93/537/base 2025-09-07T07:46:33.5620159Z * [new branch] gh/bobrenjc93/537/head -> origin/gh/bobrenjc93/537/head 2025-09-07T07:46:33.5620409Z * [new branch] gh/bobrenjc93/537/orig -> origin/gh/bobrenjc93/537/orig 2025-09-07T07:46:33.5620640Z * [new branch] gh/bobrenjc93/539/base -> origin/gh/bobrenjc93/539/base 2025-09-07T07:46:33.5620885Z * [new branch] gh/bobrenjc93/539/head -> origin/gh/bobrenjc93/539/head 2025-09-07T07:46:33.5621117Z * [new branch] gh/bobrenjc93/539/orig -> origin/gh/bobrenjc93/539/orig 2025-09-07T07:46:33.5621365Z * 
[new branch] gh/bobrenjc93/540/base -> origin/gh/bobrenjc93/540/base 2025-09-07T07:46:33.5621597Z * [new branch] gh/bobrenjc93/540/head -> origin/gh/bobrenjc93/540/head 2025-09-07T07:46:33.5621827Z * [new branch] gh/bobrenjc93/540/orig -> origin/gh/bobrenjc93/540/orig 2025-09-07T07:46:33.5622068Z * [new branch] gh/bobrenjc93/541/base -> origin/gh/bobrenjc93/541/base 2025-09-07T07:46:33.5622301Z * [new branch] gh/bobrenjc93/541/head -> origin/gh/bobrenjc93/541/head 2025-09-07T07:46:33.5622552Z * [new branch] gh/bobrenjc93/541/orig -> origin/gh/bobrenjc93/541/orig 2025-09-07T07:46:33.5622790Z * [new branch] gh/bobrenjc93/542/base -> origin/gh/bobrenjc93/542/base 2025-09-07T07:46:33.5623025Z * [new branch] gh/bobrenjc93/542/head -> origin/gh/bobrenjc93/542/head 2025-09-07T07:46:33.5623272Z * [new branch] gh/bobrenjc93/542/orig -> origin/gh/bobrenjc93/542/orig 2025-09-07T07:46:33.5623506Z * [new branch] gh/bobrenjc93/543/base -> origin/gh/bobrenjc93/543/base 2025-09-07T07:46:33.5623752Z * [new branch] gh/bobrenjc93/543/head -> origin/gh/bobrenjc93/543/head 2025-09-07T07:46:33.5623985Z * [new branch] gh/bobrenjc93/543/orig -> origin/gh/bobrenjc93/543/orig 2025-09-07T07:46:33.5624234Z * [new branch] gh/bobrenjc93/544/base -> origin/gh/bobrenjc93/544/base 2025-09-07T07:46:33.5624560Z * [new branch] gh/bobrenjc93/544/head -> origin/gh/bobrenjc93/544/head 2025-09-07T07:46:33.5624799Z * [new branch] gh/bobrenjc93/544/orig -> origin/gh/bobrenjc93/544/orig 2025-09-07T07:46:33.5625047Z * [new branch] gh/bobrenjc93/545/base -> origin/gh/bobrenjc93/545/base 2025-09-07T07:46:33.5625280Z * [new branch] gh/bobrenjc93/545/head -> origin/gh/bobrenjc93/545/head 2025-09-07T07:46:33.5625520Z * [new branch] gh/bobrenjc93/545/orig -> origin/gh/bobrenjc93/545/orig 2025-09-07T07:46:33.5625756Z * [new branch] gh/bobrenjc93/546/base -> origin/gh/bobrenjc93/546/base 2025-09-07T07:46:33.5626005Z * [new branch] gh/bobrenjc93/546/head -> origin/gh/bobrenjc93/546/head 2025-09-07T07:46:33.5626244Z * 
[new branch] gh/bobrenjc93/546/orig -> origin/gh/bobrenjc93/546/orig 2025-09-07T07:46:33.5626482Z * [new branch] gh/bobrenjc93/547/base -> origin/gh/bobrenjc93/547/base 2025-09-07T07:46:33.5626736Z * [new branch] gh/bobrenjc93/547/head -> origin/gh/bobrenjc93/547/head 2025-09-07T07:46:33.5626972Z * [new branch] gh/bobrenjc93/547/orig -> origin/gh/bobrenjc93/547/orig 2025-09-07T07:46:33.5627213Z * [new branch] gh/bobrenjc93/548/base -> origin/gh/bobrenjc93/548/base 2025-09-07T07:46:33.5627442Z * [new branch] gh/bobrenjc93/548/head -> origin/gh/bobrenjc93/548/head 2025-09-07T07:46:33.5627684Z * [new branch] gh/bobrenjc93/548/orig -> origin/gh/bobrenjc93/548/orig 2025-09-07T07:46:33.5627918Z * [new branch] gh/bobrenjc93/549/base -> origin/gh/bobrenjc93/549/base 2025-09-07T07:46:33.5628149Z * [new branch] gh/bobrenjc93/549/head -> origin/gh/bobrenjc93/549/head 2025-09-07T07:46:33.5628489Z * [new branch] gh/bobrenjc93/549/orig -> origin/gh/bobrenjc93/549/orig 2025-09-07T07:46:33.5628723Z * [new branch] gh/bobrenjc93/550/base -> origin/gh/bobrenjc93/550/base 2025-09-07T07:46:33.5628960Z * [new branch] gh/bobrenjc93/550/head -> origin/gh/bobrenjc93/550/head 2025-09-07T07:46:33.5629193Z * [new branch] gh/bobrenjc93/550/orig -> origin/gh/bobrenjc93/550/orig 2025-09-07T07:46:33.5629421Z * [new branch] gh/bobrenjc93/551/base -> origin/gh/bobrenjc93/551/base 2025-09-07T07:46:33.5629663Z * [new branch] gh/bobrenjc93/551/head -> origin/gh/bobrenjc93/551/head 2025-09-07T07:46:33.5629895Z * [new branch] gh/bobrenjc93/551/orig -> origin/gh/bobrenjc93/551/orig 2025-09-07T07:46:33.5630140Z * [new branch] gh/bobrenjc93/552/base -> origin/gh/bobrenjc93/552/base 2025-09-07T07:46:33.5630374Z * [new branch] gh/bobrenjc93/552/head -> origin/gh/bobrenjc93/552/head 2025-09-07T07:46:33.5630624Z * [new branch] gh/bobrenjc93/552/orig -> origin/gh/bobrenjc93/552/orig 2025-09-07T07:46:33.5630857Z * [new branch] gh/bobrenjc93/553/base -> origin/gh/bobrenjc93/553/base 2025-09-07T07:46:33.5631090Z * 
[new branch] gh/bobrenjc93/553/head -> origin/gh/bobrenjc93/553/head 2025-09-07T07:46:33.5631326Z * [new branch] gh/bobrenjc93/553/orig -> origin/gh/bobrenjc93/553/orig 2025-09-07T07:46:33.5631561Z * [new branch] gh/bobrenjc93/554/base -> origin/gh/bobrenjc93/554/base 2025-09-07T07:46:33.5631806Z * [new branch] gh/bobrenjc93/554/head -> origin/gh/bobrenjc93/554/head 2025-09-07T07:46:33.5632038Z * [new branch] gh/bobrenjc93/554/orig -> origin/gh/bobrenjc93/554/orig 2025-09-07T07:46:33.5632281Z * [new branch] gh/bobrenjc93/555/base -> origin/gh/bobrenjc93/555/base 2025-09-07T07:46:33.5632517Z * [new branch] gh/bobrenjc93/555/head -> origin/gh/bobrenjc93/555/head 2025-09-07T07:46:33.5632749Z * [new branch] gh/bobrenjc93/555/orig -> origin/gh/bobrenjc93/555/orig 2025-09-07T07:46:33.5633108Z * [new branch] gh/bobrenjc93/556/base -> origin/gh/bobrenjc93/556/base 2025-09-07T07:46:33.5633343Z * [new branch] gh/bobrenjc93/556/head -> origin/gh/bobrenjc93/556/head 2025-09-07T07:46:33.5633590Z * [new branch] gh/bobrenjc93/556/orig -> origin/gh/bobrenjc93/556/orig 2025-09-07T07:46:33.5633842Z * [new branch] gh/briancoutinho/2/base -> origin/gh/briancoutinho/2/base 2025-09-07T07:46:33.5634088Z * [new branch] gh/briancoutinho/2/head -> origin/gh/briancoutinho/2/head 2025-09-07T07:46:33.5634304Z * [new branch] gh/c00w/23/base -> origin/gh/c00w/23/base 2025-09-07T07:46:33.5634502Z * [new branch] gh/c00w/23/head -> origin/gh/c00w/23/head 2025-09-07T07:46:33.5634718Z * [new branch] gh/c00w/48/base -> origin/gh/c00w/48/base 2025-09-07T07:46:33.5634916Z * [new branch] gh/c00w/48/head -> origin/gh/c00w/48/head 2025-09-07T07:46:33.5635126Z * [new branch] gh/c00w/48/orig -> origin/gh/c00w/48/orig 2025-09-07T07:46:33.5635321Z * [new branch] gh/c00w/53/base -> origin/gh/c00w/53/base 2025-09-07T07:46:33.5635517Z * [new branch] gh/c00w/53/head -> origin/gh/c00w/53/head 2025-09-07T07:46:33.5635725Z * [new branch] gh/c00w/53/orig -> origin/gh/c00w/53/orig 2025-09-07T07:46:33.5635919Z * [new 
branch] gh/c00w/54/base -> origin/gh/c00w/54/base 2025-09-07T07:46:33.5636129Z * [new branch] gh/c00w/54/head -> origin/gh/c00w/54/head 2025-09-07T07:46:33.5636324Z * [new branch] gh/c00w/54/orig -> origin/gh/c00w/54/orig 2025-09-07T07:46:33.5636623Z * [new branch] gh/c00w/55/base -> origin/gh/c00w/55/base 2025-09-07T07:46:33.5636832Z * [new branch] gh/c00w/55/head -> origin/gh/c00w/55/head 2025-09-07T07:46:33.5637032Z * [new branch] gh/c00w/55/orig -> origin/gh/c00w/55/orig 2025-09-07T07:46:33.5637240Z * [new branch] gh/c00w/56/base -> origin/gh/c00w/56/base 2025-09-07T07:46:33.5637428Z * [new branch] gh/c00w/56/head -> origin/gh/c00w/56/head 2025-09-07T07:46:33.5637635Z * [new branch] gh/c00w/56/orig -> origin/gh/c00w/56/orig 2025-09-07T07:46:33.5637854Z * [new branch] gh/clee2000/1/base -> origin/gh/clee2000/1/base 2025-09-07T07:46:33.5638066Z * [new branch] gh/clee2000/1/head -> origin/gh/clee2000/1/head 2025-09-07T07:46:33.5638292Z * [new branch] gh/clee2000/1/orig -> origin/gh/clee2000/1/orig 2025-09-07T07:46:33.5638540Z * [new branch] gh/coconutruben/1/base -> origin/gh/coconutruben/1/base 2025-09-07T07:46:33.5638796Z * [new branch] gh/coconutruben/1/head -> origin/gh/coconutruben/1/head 2025-09-07T07:46:33.5639043Z * [new branch] gh/coconutruben/11/base -> origin/gh/coconutruben/11/base 2025-09-07T07:46:33.5639302Z * [new branch] gh/coconutruben/11/head -> origin/gh/coconutruben/11/head 2025-09-07T07:46:33.5639546Z * [new branch] gh/coconutruben/11/orig -> origin/gh/coconutruben/11/orig 2025-09-07T07:46:33.5639789Z * [new branch] gh/coconutruben/12/base -> origin/gh/coconutruben/12/base 2025-09-07T07:46:33.5640045Z * [new branch] gh/coconutruben/12/head -> origin/gh/coconutruben/12/head 2025-09-07T07:46:33.5640287Z * [new branch] gh/coconutruben/12/orig -> origin/gh/coconutruben/12/orig 2025-09-07T07:46:33.5640541Z * [new branch] gh/coconutruben/13/base -> origin/gh/coconutruben/13/base 2025-09-07T07:46:33.5640786Z * [new branch] gh/coconutruben/13/head -> 
origin/gh/coconutruben/13/head 2025-09-07T07:46:33.5641125Z * [new branch] gh/coconutruben/13/orig -> origin/gh/coconutruben/13/orig 2025-09-07T07:46:33.5641379Z * [new branch] gh/coconutruben/14/base -> origin/gh/coconutruben/14/base 2025-09-07T07:46:33.5641620Z * [new branch] gh/coconutruben/14/head -> origin/gh/coconutruben/14/head 2025-09-07T07:46:33.5641875Z * [new branch] gh/coconutruben/14/orig -> origin/gh/coconutruben/14/orig 2025-09-07T07:46:33.5642120Z * [new branch] gh/coconutruben/15/base -> origin/gh/coconutruben/15/base 2025-09-07T07:46:33.5642374Z * [new branch] gh/coconutruben/15/head -> origin/gh/coconutruben/15/head 2025-09-07T07:46:33.5642617Z * [new branch] gh/coconutruben/15/orig -> origin/gh/coconutruben/15/orig 2025-09-07T07:46:33.5642996Z * [new branch] gh/coconutruben/16/base -> origin/gh/coconutruben/16/base 2025-09-07T07:46:33.5643267Z * [new branch] gh/coconutruben/16/head -> origin/gh/coconutruben/16/head 2025-09-07T07:46:33.5643516Z * [new branch] gh/coconutruben/16/orig -> origin/gh/coconutruben/16/orig 2025-09-07T07:46:33.5643773Z * [new branch] gh/coconutruben/17/base -> origin/gh/coconutruben/17/base 2025-09-07T07:46:33.5644018Z * [new branch] gh/coconutruben/17/head -> origin/gh/coconutruben/17/head 2025-09-07T07:46:33.5644277Z * [new branch] gh/coconutruben/17/orig -> origin/gh/coconutruben/17/orig 2025-09-07T07:46:33.5644520Z * [new branch] gh/coconutruben/18/base -> origin/gh/coconutruben/18/base 2025-09-07T07:46:33.5644765Z * [new branch] gh/coconutruben/18/head -> origin/gh/coconutruben/18/head 2025-09-07T07:46:33.5645023Z * [new branch] gh/coconutruben/18/orig -> origin/gh/coconutruben/18/orig 2025-09-07T07:46:33.5645373Z * [new branch] gh/coconutruben/19/base -> origin/gh/coconutruben/19/base 2025-09-07T07:46:33.5645633Z * [new branch] gh/coconutruben/19/head -> origin/gh/coconutruben/19/head 2025-09-07T07:46:33.5645876Z * [new branch] gh/coconutruben/19/orig -> origin/gh/coconutruben/19/orig 2025-09-07T07:46:33.5646119Z * 
[new branch] gh/coconutruben/20/base -> origin/gh/coconutruben/20/base 2025-09-07T07:46:33.5646377Z * [new branch] gh/coconutruben/20/head -> origin/gh/coconutruben/20/head 2025-09-07T07:46:33.5646622Z * [new branch] gh/coconutruben/20/orig -> origin/gh/coconutruben/20/orig 2025-09-07T07:46:33.5646880Z * [new branch] gh/coconutruben/21/base -> origin/gh/coconutruben/21/base 2025-09-07T07:46:33.5647124Z * [new branch] gh/coconutruben/21/head -> origin/gh/coconutruben/21/head 2025-09-07T07:46:33.5647387Z * [new branch] gh/coconutruben/21/orig -> origin/gh/coconutruben/21/orig 2025-09-07T07:46:33.5647631Z * [new branch] gh/coconutruben/22/base -> origin/gh/coconutruben/22/base 2025-09-07T07:46:33.5647875Z * [new branch] gh/coconutruben/22/head -> origin/gh/coconutruben/22/head 2025-09-07T07:46:33.5648129Z * [new branch] gh/coconutruben/22/orig -> origin/gh/coconutruben/22/orig 2025-09-07T07:46:33.5648367Z * [new branch] gh/coconutruben/24/base -> origin/gh/coconutruben/24/base 2025-09-07T07:46:33.5648620Z * [new branch] gh/coconutruben/24/head -> origin/gh/coconutruben/24/head 2025-09-07T07:46:33.5648863Z * [new branch] gh/coconutruben/24/orig -> origin/gh/coconutruben/24/orig 2025-09-07T07:46:33.5649116Z * [new branch] gh/coconutruben/25/base -> origin/gh/coconutruben/25/base 2025-09-07T07:46:33.5649354Z * [new branch] gh/coconutruben/25/head -> origin/gh/coconutruben/25/head 2025-09-07T07:46:33.5649600Z * [new branch] gh/coconutruben/25/orig -> origin/gh/coconutruben/25/orig 2025-09-07T07:46:33.5649967Z * [new branch] gh/coconutruben/28/base -> origin/gh/coconutruben/28/base 2025-09-07T07:46:33.5650212Z * [new branch] gh/coconutruben/28/head -> origin/gh/coconutruben/28/head 2025-09-07T07:46:33.5650462Z * [new branch] gh/coconutruben/28/orig -> origin/gh/coconutruben/28/orig 2025-09-07T07:46:33.5650704Z * [new branch] gh/coconutruben/29/base -> origin/gh/coconutruben/29/base 2025-09-07T07:46:33.5650960Z * [new branch] gh/coconutruben/29/head -> 
origin/gh/coconutruben/29/head 2025-09-07T07:46:33.5651202Z * [new branch] gh/coconutruben/29/orig -> origin/gh/coconutruben/29/orig 2025-09-07T07:46:33.5651442Z * [new branch] gh/coconutruben/30/base -> origin/gh/coconutruben/30/base 2025-09-07T07:46:33.5651700Z * [new branch] gh/coconutruben/30/head -> origin/gh/coconutruben/30/head 2025-09-07T07:46:33.5651941Z * [new branch] gh/coconutruben/30/orig -> origin/gh/coconutruben/30/orig 2025-09-07T07:46:33.5652200Z * [new branch] gh/coconutruben/31/base -> origin/gh/coconutruben/31/base 2025-09-07T07:46:33.5652440Z * [new branch] gh/coconutruben/31/head -> origin/gh/coconutruben/31/head 2025-09-07T07:46:33.5652682Z * [new branch] gh/coconutruben/31/orig -> origin/gh/coconutruben/31/orig 2025-09-07T07:46:33.5652936Z * [new branch] gh/coconutruben/32/base -> origin/gh/coconutruben/32/base 2025-09-07T07:46:33.5653182Z * [new branch] gh/coconutruben/32/head -> origin/gh/coconutruben/32/head 2025-09-07T07:46:33.5653433Z * [new branch] gh/coconutruben/32/orig -> origin/gh/coconutruben/32/orig 2025-09-07T07:46:33.5653673Z * [new branch] gh/coconutruben/33/base -> origin/gh/coconutruben/33/base 2025-09-07T07:46:33.5654023Z * [new branch] gh/coconutruben/33/head -> origin/gh/coconutruben/33/head 2025-09-07T07:46:33.5654269Z * [new branch] gh/coconutruben/33/orig -> origin/gh/coconutruben/33/orig 2025-09-07T07:46:33.5654510Z * [new branch] gh/coconutruben/34/base -> origin/gh/coconutruben/34/base 2025-09-07T07:46:33.5654764Z * [new branch] gh/coconutruben/34/head -> origin/gh/coconutruben/34/head 2025-09-07T07:46:33.5655007Z * [new branch] gh/coconutruben/34/orig -> origin/gh/coconutruben/34/orig 2025-09-07T07:46:33.5655262Z * [new branch] gh/coconutruben/35/base -> origin/gh/coconutruben/35/base 2025-09-07T07:46:33.5655503Z * [new branch] gh/coconutruben/35/head -> origin/gh/coconutruben/35/head 2025-09-07T07:46:33.5655758Z * [new branch] gh/coconutruben/35/orig -> origin/gh/coconutruben/35/orig 2025-09-07T07:46:33.5656003Z * 
[new branch] gh/coconutruben/36/base -> origin/gh/coconutruben/36/base 2025-09-07T07:46:33.5656244Z * [new branch] gh/coconutruben/36/head -> origin/gh/coconutruben/36/head 2025-09-07T07:46:33.5656500Z * [new branch] gh/coconutruben/36/orig -> origin/gh/coconutruben/36/orig 2025-09-07T07:46:33.5656740Z * [new branch] gh/coconutruben/37/base -> origin/gh/coconutruben/37/base 2025-09-07T07:46:33.5656990Z * [new branch] gh/coconutruben/37/head -> origin/gh/coconutruben/37/head 2025-09-07T07:46:33.5657235Z * [new branch] gh/coconutruben/37/orig -> origin/gh/coconutruben/37/orig 2025-09-07T07:46:33.5657566Z * [new branch] gh/coconutruben/38/base -> origin/gh/coconutruben/38/base 2025-09-07T07:46:33.5657830Z * [new branch] gh/coconutruben/38/head -> origin/gh/coconutruben/38/head 2025-09-07T07:46:33.5658075Z * [new branch] gh/coconutruben/38/orig -> origin/gh/coconutruben/38/orig 2025-09-07T07:46:33.5658336Z * [new branch] gh/coconutruben/39/base -> origin/gh/coconutruben/39/base 2025-09-07T07:46:33.5658676Z * [new branch] gh/coconutruben/39/head -> origin/gh/coconutruben/39/head 2025-09-07T07:46:33.5658938Z * [new branch] gh/coconutruben/39/orig -> origin/gh/coconutruben/39/orig 2025-09-07T07:46:33.5659181Z * [new branch] gh/coconutruben/40/base -> origin/gh/coconutruben/40/base 2025-09-07T07:46:33.5659421Z * [new branch] gh/coconutruben/40/head -> origin/gh/coconutruben/40/head 2025-09-07T07:46:33.5659676Z * [new branch] gh/coconutruben/40/orig -> origin/gh/coconutruben/40/orig 2025-09-07T07:46:33.5659921Z * [new branch] gh/coconutruben/41/base -> origin/gh/coconutruben/41/base 2025-09-07T07:46:33.5660168Z * [new branch] gh/coconutruben/41/head -> origin/gh/coconutruben/41/head 2025-09-07T07:46:33.5660413Z * [new branch] gh/coconutruben/41/orig -> origin/gh/coconutruben/41/orig 2025-09-07T07:46:33.5660663Z * [new branch] gh/coconutruben/42/base -> origin/gh/coconutruben/42/base 2025-09-07T07:46:33.5660932Z * [new branch] gh/coconutruben/42/head -> 
origin/gh/coconutruben/42/head 2025-09-07T07:46:33.5661171Z * [new branch] gh/coconutruben/42/orig -> origin/gh/coconutruben/42/orig 2025-09-07T07:46:33.5661420Z * [new branch] gh/coconutruben/43/base -> origin/gh/coconutruben/43/base 2025-09-07T07:46:33.5661660Z * [new branch] gh/coconutruben/43/head -> origin/gh/coconutruben/43/head 2025-09-07T07:46:33.5661913Z * [new branch] gh/coconutruben/43/orig -> origin/gh/coconutruben/43/orig 2025-09-07T07:46:33.5662151Z * [new branch] gh/coconutruben/44/base -> origin/gh/coconutruben/44/base 2025-09-07T07:46:33.5662390Z * [new branch] gh/coconutruben/44/head -> origin/gh/coconutruben/44/head 2025-09-07T07:46:33.5662753Z * [new branch] gh/coconutruben/44/orig -> origin/gh/coconutruben/44/orig 2025-09-07T07:46:33.5662994Z * [new branch] gh/coconutruben/45/base -> origin/gh/coconutruben/45/base 2025-09-07T07:46:33.5663238Z * [new branch] gh/coconutruben/45/head -> origin/gh/coconutruben/45/head 2025-09-07T07:46:33.5663480Z * [new branch] gh/coconutruben/45/orig -> origin/gh/coconutruben/45/orig 2025-09-07T07:46:33.5663718Z * [new branch] gh/coconutruben/46/base -> origin/gh/coconutruben/46/base 2025-09-07T07:46:33.5663970Z * [new branch] gh/coconutruben/46/head -> origin/gh/coconutruben/46/head 2025-09-07T07:46:33.5664213Z * [new branch] gh/coconutruben/46/orig -> origin/gh/coconutruben/46/orig 2025-09-07T07:46:33.5664465Z * [new branch] gh/coconutruben/47/base -> origin/gh/coconutruben/47/base 2025-09-07T07:46:33.5664705Z * [new branch] gh/coconutruben/47/head -> origin/gh/coconutruben/47/head 2025-09-07T07:46:33.5664961Z * [new branch] gh/coconutruben/47/orig -> origin/gh/coconutruben/47/orig 2025-09-07T07:46:33.5665207Z * [new branch] gh/coconutruben/48/base -> origin/gh/coconutruben/48/base 2025-09-07T07:46:33.5665450Z * [new branch] gh/coconutruben/48/head -> origin/gh/coconutruben/48/head 2025-09-07T07:46:33.5665704Z * [new branch] gh/coconutruben/48/orig -> origin/gh/coconutruben/48/orig 2025-09-07T07:46:33.5665944Z * 
[new branch] gh/coconutruben/49/base -> origin/gh/coconutruben/49/base 2025-09-07T07:46:33.5666199Z * [new branch] gh/coconutruben/49/head -> origin/gh/coconutruben/49/head 2025-09-07T07:46:33.5666441Z * [new branch] gh/coconutruben/49/orig -> origin/gh/coconutruben/49/orig 2025-09-07T07:46:33.5666692Z * [new branch] gh/coconutruben/50/base -> origin/gh/coconutruben/50/base 2025-09-07T07:46:33.5666937Z * [new branch] gh/coconutruben/50/head -> origin/gh/coconutruben/50/head 2025-09-07T07:46:33.5667268Z * [new branch] gh/coconutruben/50/orig -> origin/gh/coconutruben/50/orig 2025-09-07T07:46:33.5667519Z * [new branch] gh/coconutruben/51/base -> origin/gh/coconutruben/51/base 2025-09-07T07:46:33.5667758Z * [new branch] gh/coconutruben/51/head -> origin/gh/coconutruben/51/head 2025-09-07T07:46:33.5668012Z * [new branch] gh/coconutruben/51/orig -> origin/gh/coconutruben/51/orig 2025-09-07T07:46:33.5668250Z * [new branch] gh/coconutruben/52/base -> origin/gh/coconutruben/52/base 2025-09-07T07:46:33.5668497Z * [new branch] gh/coconutruben/52/head -> origin/gh/coconutruben/52/head 2025-09-07T07:46:33.5668744Z * [new branch] gh/coconutruben/52/orig -> origin/gh/coconutruben/52/orig 2025-09-07T07:46:33.5668984Z * [new branch] gh/coconutruben/53/base -> origin/gh/coconutruben/53/base 2025-09-07T07:46:33.5669236Z * [new branch] gh/coconutruben/53/head -> origin/gh/coconutruben/53/head 2025-09-07T07:46:33.5669484Z * [new branch] gh/coconutruben/53/orig -> origin/gh/coconutruben/53/orig 2025-09-07T07:46:33.5669736Z * [new branch] gh/coconutruben/54/base -> origin/gh/coconutruben/54/base 2025-09-07T07:46:33.5669977Z * [new branch] gh/coconutruben/54/head -> origin/gh/coconutruben/54/head 2025-09-07T07:46:33.5670222Z * [new branch] gh/coconutruben/54/orig -> origin/gh/coconutruben/54/orig 2025-09-07T07:46:33.5670470Z * [new branch] gh/coconutruben/55/base -> origin/gh/coconutruben/55/base 2025-09-07T07:46:33.5670712Z * [new branch] gh/coconutruben/55/head -> 
origin/gh/coconutruben/55/head 2025-09-07T07:46:33.5670962Z * [new branch] gh/coconutruben/55/orig -> origin/gh/coconutruben/55/orig 2025-09-07T07:46:33.5671305Z * [new branch] gh/coconutruben/56/base -> origin/gh/coconutruben/56/base 2025-09-07T07:46:33.5671552Z * [new branch] gh/coconutruben/56/head -> origin/gh/coconutruben/56/head 2025-09-07T07:46:33.5671790Z * [new branch] gh/coconutruben/56/orig -> origin/gh/coconutruben/56/orig 2025-09-07T07:46:33.5672027Z * [new branch] gh/coconutruben/57/base -> origin/gh/coconutruben/57/base 2025-09-07T07:46:33.5672279Z * [new branch] gh/coconutruben/57/head -> origin/gh/coconutruben/57/head 2025-09-07T07:46:33.5672518Z * [new branch] gh/coconutruben/57/orig -> origin/gh/coconutruben/57/orig 2025-09-07T07:46:33.5672769Z * [new branch] gh/coconutruben/58/base -> origin/gh/coconutruben/58/base 2025-09-07T07:46:33.5673004Z * [new branch] gh/coconutruben/58/head -> origin/gh/coconutruben/58/head 2025-09-07T07:46:33.5673256Z * [new branch] gh/coconutruben/58/orig -> origin/gh/coconutruben/58/orig 2025-09-07T07:46:33.5673498Z * [new branch] gh/coconutruben/59/base -> origin/gh/coconutruben/59/base 2025-09-07T07:46:33.5673742Z * [new branch] gh/coconutruben/59/head -> origin/gh/coconutruben/59/head 2025-09-07T07:46:33.5673992Z * [new branch] gh/coconutruben/59/orig -> origin/gh/coconutruben/59/orig 2025-09-07T07:46:33.5674231Z * [new branch] gh/coconutruben/60/base -> origin/gh/coconutruben/60/base 2025-09-07T07:46:33.5674484Z * [new branch] gh/coconutruben/60/head -> origin/gh/coconutruben/60/head 2025-09-07T07:46:33.5674721Z * [new branch] gh/coconutruben/60/orig -> origin/gh/coconutruben/60/orig 2025-09-07T07:46:33.5674961Z * [new branch] gh/coconutruben/61/base -> origin/gh/coconutruben/61/base 2025-09-07T07:46:33.5675209Z * [new branch] gh/coconutruben/61/head -> origin/gh/coconutruben/61/head 2025-09-07T07:46:33.5675452Z * [new branch] gh/coconutruben/61/orig -> origin/gh/coconutruben/61/orig 2025-09-07T07:46:33.5675704Z * 
[new branch] gh/coconutruben/62/base -> origin/gh/coconutruben/62/base 2025-09-07T07:46:33.5676028Z * [new branch] gh/coconutruben/62/head -> origin/gh/coconutruben/62/head 2025-09-07T07:46:33.5676280Z * [new branch] gh/coconutruben/62/orig -> origin/gh/coconutruben/62/orig 2025-09-07T07:46:33.5676523Z * [new branch] gh/coconutruben/63/base -> origin/gh/coconutruben/63/base 2025-09-07T07:46:33.5676760Z * [new branch] gh/coconutruben/63/head -> origin/gh/coconutruben/63/head 2025-09-07T07:46:33.5677010Z * [new branch] gh/coconutruben/63/orig -> origin/gh/coconutruben/63/orig 2025-09-07T07:46:33.5677252Z * [new branch] gh/coconutruben/64/base -> origin/gh/coconutruben/64/base 2025-09-07T07:46:33.5677498Z * [new branch] gh/coconutruben/64/head -> origin/gh/coconutruben/64/head 2025-09-07T07:46:33.5677744Z * [new branch] gh/coconutruben/64/orig -> origin/gh/coconutruben/64/orig 2025-09-07T07:46:33.5678000Z * [new branch] gh/coconutruben/65/base -> origin/gh/coconutruben/65/base 2025-09-07T07:46:33.5678242Z * [new branch] gh/coconutruben/65/head -> origin/gh/coconutruben/65/head 2025-09-07T07:46:33.5678477Z * [new branch] gh/coconutruben/65/orig -> origin/gh/coconutruben/65/orig 2025-09-07T07:46:33.5678728Z * [new branch] gh/coconutruben/66/base -> origin/gh/coconutruben/66/base 2025-09-07T07:46:33.5678967Z * [new branch] gh/coconutruben/66/head -> origin/gh/coconutruben/66/head 2025-09-07T07:46:33.5679216Z * [new branch] gh/coconutruben/66/orig -> origin/gh/coconutruben/66/orig 2025-09-07T07:46:33.5679478Z * [new branch] gh/codingwithsurya/12/base -> origin/gh/codingwithsurya/12/base 2025-09-07T07:46:33.5679832Z * [new branch] gh/codingwithsurya/12/head -> origin/gh/codingwithsurya/12/head 2025-09-07T07:46:33.5680101Z * [new branch] gh/codingwithsurya/12/orig -> origin/gh/codingwithsurya/12/orig 2025-09-07T07:46:33.5680368Z * [new branch] gh/codingwithsurya/14/base -> origin/gh/codingwithsurya/14/base 2025-09-07T07:46:33.5680635Z * [new branch] 
gh/codingwithsurya/14/head -> origin/gh/codingwithsurya/14/head 2025-09-07T07:46:33.5680898Z * [new branch] gh/codingwithsurya/14/orig -> origin/gh/codingwithsurya/14/orig 2025-09-07T07:46:33.5681168Z * [new branch] gh/codingwithsurya/15/base -> origin/gh/codingwithsurya/15/base 2025-09-07T07:46:33.5681431Z * [new branch] gh/codingwithsurya/15/head -> origin/gh/codingwithsurya/15/head 2025-09-07T07:46:33.5681688Z * [new branch] gh/codingwithsurya/15/orig -> origin/gh/codingwithsurya/15/orig 2025-09-07T07:46:33.5681960Z * [new branch] gh/codingwithsurya/16/base -> origin/gh/codingwithsurya/16/base 2025-09-07T07:46:33.5682226Z * [new branch] gh/codingwithsurya/16/head -> origin/gh/codingwithsurya/16/head 2025-09-07T07:46:33.5682497Z * [new branch] gh/codingwithsurya/16/orig -> origin/gh/codingwithsurya/16/orig 2025-09-07T07:46:33.5682760Z * [new branch] gh/codingwithsurya/17/base -> origin/gh/codingwithsurya/17/base 2025-09-07T07:46:33.5683163Z * [new branch] gh/codingwithsurya/17/head -> origin/gh/codingwithsurya/17/head 2025-09-07T07:46:33.5683432Z * [new branch] gh/codingwithsurya/17/orig -> origin/gh/codingwithsurya/17/orig 2025-09-07T07:46:33.5683696Z * [new branch] gh/codingwithsurya/18/base -> origin/gh/codingwithsurya/18/base 2025-09-07T07:46:33.5683969Z * [new branch] gh/codingwithsurya/18/head -> origin/gh/codingwithsurya/18/head 2025-09-07T07:46:33.5684225Z * [new branch] gh/codingwithsurya/18/orig -> origin/gh/codingwithsurya/18/orig 2025-09-07T07:46:33.5684502Z * [new branch] gh/codingwithsurya/19/base -> origin/gh/codingwithsurya/19/base 2025-09-07T07:46:33.5684893Z * [new branch] gh/codingwithsurya/19/head -> origin/gh/codingwithsurya/19/head 2025-09-07T07:46:33.5685173Z * [new branch] gh/codingwithsurya/19/orig -> origin/gh/codingwithsurya/19/orig 2025-09-07T07:46:33.5685437Z * [new branch] gh/codingwithsurya/20/base -> origin/gh/codingwithsurya/20/base 2025-09-07T07:46:33.5685699Z * [new branch] gh/codingwithsurya/20/head -> 
origin/gh/codingwithsurya/20/head 2025-09-07T07:46:33.5685971Z * [new branch] gh/codingwithsurya/20/orig -> origin/gh/codingwithsurya/20/orig 2025-09-07T07:46:33.5686231Z * [new branch] gh/codingwithsurya/21/base -> origin/gh/codingwithsurya/21/base 2025-09-07T07:46:33.5686501Z * [new branch] gh/codingwithsurya/21/head -> origin/gh/codingwithsurya/21/head 2025-09-07T07:46:33.5686766Z * [new branch] gh/codingwithsurya/21/orig -> origin/gh/codingwithsurya/21/orig 2025-09-07T07:46:33.5687004Z * [new branch] gh/colinchan15/1/base -> origin/gh/colinchan15/1/base 2025-09-07T07:46:33.5687245Z * [new branch] gh/colinchan15/1/head -> origin/gh/colinchan15/1/head 2025-09-07T07:46:33.5687476Z * [new branch] gh/colinchan15/2/base -> origin/gh/colinchan15/2/base 2025-09-07T07:46:33.5687713Z * [new branch] gh/colinchan15/2/head -> origin/gh/colinchan15/2/head 2025-09-07T07:46:33.5687943Z * [new branch] gh/colinchan15/3/base -> origin/gh/colinchan15/3/base 2025-09-07T07:46:33.5688182Z * [new branch] gh/colinchan15/3/head -> origin/gh/colinchan15/3/head 2025-09-07T07:46:33.5688408Z * [new branch] gh/colinchan15/6/base -> origin/gh/colinchan15/6/base 2025-09-07T07:46:33.5688637Z * [new branch] gh/colinchan15/6/head -> origin/gh/colinchan15/6/head 2025-09-07T07:46:33.5689006Z * [new branch] gh/davidberard98/382/base -> origin/gh/davidberard98/382/base 2025-09-07T07:46:33.5689266Z * [new branch] gh/davidberard98/382/head -> origin/gh/davidberard98/382/head 2025-09-07T07:46:33.5689528Z * [new branch] gh/davidberard98/382/orig -> origin/gh/davidberard98/382/orig 2025-09-07T07:46:33.5689776Z * [new branch] gh/davidberard98/386/base -> origin/gh/davidberard98/386/base 2025-09-07T07:46:33.5690038Z * [new branch] gh/davidberard98/386/head -> origin/gh/davidberard98/386/head 2025-09-07T07:46:33.5690287Z * [new branch] gh/davidberard98/386/orig -> origin/gh/davidberard98/386/orig 2025-09-07T07:46:33.5690538Z * [new branch] gh/davidberard98/391/base -> origin/gh/davidberard98/391/base 
2025-09-07T07:46:33.5690799Z * [new branch] gh/davidberard98/391/head -> origin/gh/davidberard98/391/head 2025-09-07T07:46:33.5691048Z * [new branch] gh/davidberard98/391/orig -> origin/gh/davidberard98/391/orig 2025-09-07T07:46:33.5691311Z * [new branch] gh/davidberard98/392/base -> origin/gh/davidberard98/392/base 2025-09-07T07:46:33.5691562Z * [new branch] gh/davidberard98/392/head -> origin/gh/davidberard98/392/head 2025-09-07T07:46:33.5691810Z * [new branch] gh/davidberard98/392/orig -> origin/gh/davidberard98/392/orig 2025-09-07T07:46:33.5692067Z * [new branch] gh/davidberard98/394/base -> origin/gh/davidberard98/394/base 2025-09-07T07:46:33.5692319Z * [new branch] gh/davidberard98/394/head -> origin/gh/davidberard98/394/head 2025-09-07T07:46:33.5692582Z * [new branch] gh/davidberard98/394/orig -> origin/gh/davidberard98/394/orig 2025-09-07T07:46:33.5692829Z * [new branch] gh/davidberard98/396/base -> origin/gh/davidberard98/396/base 2025-09-07T07:46:33.5693085Z * [new branch] gh/davidberard98/396/head -> origin/gh/davidberard98/396/head 2025-09-07T07:46:33.5693336Z * [new branch] gh/davidberard98/396/orig -> origin/gh/davidberard98/396/orig 2025-09-07T07:46:33.5693678Z * [new branch] gh/davidberard98/397/base -> origin/gh/davidberard98/397/base 2025-09-07T07:46:33.5693944Z * [new branch] gh/davidberard98/397/head -> origin/gh/davidberard98/397/head 2025-09-07T07:46:33.5694190Z * [new branch] gh/davidberard98/397/orig -> origin/gh/davidberard98/397/orig 2025-09-07T07:46:33.5694449Z * [new branch] gh/davidberard98/398/base -> origin/gh/davidberard98/398/base 2025-09-07T07:46:33.5694700Z * [new branch] gh/davidberard98/398/head -> origin/gh/davidberard98/398/head 2025-09-07T07:46:33.5694956Z * [new branch] gh/davidberard98/398/orig -> origin/gh/davidberard98/398/orig 2025-09-07T07:46:33.5695205Z * [new branch] gh/davidberard98/399/base -> origin/gh/davidberard98/399/base 2025-09-07T07:46:33.5695451Z * [new branch] gh/davidberard98/399/head -> 
origin/gh/davidberard98/399/head 2025-09-07T07:46:33.5695706Z * [new branch] gh/davidberard98/399/orig -> origin/gh/davidberard98/399/orig 2025-09-07T07:46:33.5695958Z * [new branch] gh/davidberard98/400/base -> origin/gh/davidberard98/400/base 2025-09-07T07:46:33.5696215Z * [new branch] gh/davidberard98/400/head -> origin/gh/davidberard98/400/head 2025-09-07T07:46:33.5696464Z * [new branch] gh/davidberard98/400/orig -> origin/gh/davidberard98/400/orig 2025-09-07T07:46:33.5696714Z * [new branch] gh/davidberard98/401/base -> origin/gh/davidberard98/401/base 2025-09-07T07:46:33.5696969Z * [new branch] gh/davidberard98/401/head -> origin/gh/davidberard98/401/head 2025-09-07T07:46:33.5697220Z * [new branch] gh/davidberard98/401/orig -> origin/gh/davidberard98/401/orig 2025-09-07T07:46:33.5697711Z * [new branch] gh/davidberard98/402/base -> origin/gh/davidberard98/402/base 2025-09-07T07:46:33.5697964Z * [new branch] gh/davidberard98/402/head -> origin/gh/davidberard98/402/head 2025-09-07T07:46:33.5698227Z * [new branch] gh/davidberard98/402/orig -> origin/gh/davidberard98/402/orig 2025-09-07T07:46:33.5698478Z * [new branch] gh/davidberard98/403/base -> origin/gh/davidberard98/403/base 2025-09-07T07:46:33.5698724Z * [new branch] gh/davidberard98/403/head -> origin/gh/davidberard98/403/head 2025-09-07T07:46:33.5698983Z * [new branch] gh/davidberard98/403/orig -> origin/gh/davidberard98/403/orig 2025-09-07T07:46:33.5699232Z * [new branch] gh/davidberard98/404/base -> origin/gh/davidberard98/404/base 2025-09-07T07:46:33.5699485Z * [new branch] gh/davidberard98/404/head -> origin/gh/davidberard98/404/head 2025-09-07T07:46:33.5699736Z * [new branch] gh/davidberard98/404/orig -> origin/gh/davidberard98/404/orig 2025-09-07T07:46:33.5700004Z * [new branch] gh/davidberard98/405/base -> origin/gh/davidberard98/405/base 2025-09-07T07:46:33.5700256Z * [new branch] gh/davidberard98/405/head -> origin/gh/davidberard98/405/head 2025-09-07T07:46:33.5700503Z * [new branch] 
gh/davidberard98/405/orig -> origin/gh/davidberard98/405/orig 2025-09-07T07:46:33.5700761Z * [new branch] gh/davidberard98/406/base -> origin/gh/davidberard98/406/base 2025-09-07T07:46:33.5701008Z * [new branch] gh/davidberard98/406/head -> origin/gh/davidberard98/406/head 2025-09-07T07:46:33.5701274Z * [new branch] gh/davidberard98/406/orig -> origin/gh/davidberard98/406/orig 2025-09-07T07:46:33.5701522Z * [new branch] gh/davidberard98/407/base -> origin/gh/davidberard98/407/base 2025-09-07T07:46:33.5701785Z * [new branch] gh/davidberard98/407/head -> origin/gh/davidberard98/407/head 2025-09-07T07:46:33.5702035Z * [new branch] gh/davidberard98/407/orig -> origin/gh/davidberard98/407/orig 2025-09-07T07:46:33.5702284Z * [new branch] gh/davidberard98/408/base -> origin/gh/davidberard98/408/base 2025-09-07T07:46:33.5702629Z * [new branch] gh/davidberard98/408/head -> origin/gh/davidberard98/408/head 2025-09-07T07:46:33.5702889Z * [new branch] gh/davidberard98/408/orig -> origin/gh/davidberard98/408/orig 2025-09-07T07:46:33.5703153Z * [new branch] gh/davidberard98/409/base -> origin/gh/davidberard98/409/base 2025-09-07T07:46:33.5703405Z * [new branch] gh/davidberard98/409/head -> origin/gh/davidberard98/409/head 2025-09-07T07:46:33.5703656Z * [new branch] gh/davidberard98/409/orig -> origin/gh/davidberard98/409/orig 2025-09-07T07:46:33.5703907Z * [new branch] gh/desertfire/594/base -> origin/gh/desertfire/594/base 2025-09-07T07:46:33.5704147Z * [new branch] gh/desertfire/594/head -> origin/gh/desertfire/594/head 2025-09-07T07:46:33.5704402Z * [new branch] gh/desertfire/594/orig -> origin/gh/desertfire/594/orig 2025-09-07T07:46:33.5704642Z * [new branch] gh/desertfire/595/base -> origin/gh/desertfire/595/base 2025-09-07T07:46:33.5704895Z * [new branch] gh/desertfire/595/head -> origin/gh/desertfire/595/head 2025-09-07T07:46:33.5705133Z * [new branch] gh/desertfire/595/orig -> origin/gh/desertfire/595/orig 2025-09-07T07:46:33.5705370Z * [new branch] 
gh/desertfire/597/base -> origin/gh/desertfire/597/base 2025-09-07T07:46:33.5705622Z * [new branch] gh/desertfire/597/head -> origin/gh/desertfire/597/head 2025-09-07T07:46:33.5705859Z * [new branch] gh/desertfire/597/orig -> origin/gh/desertfire/597/orig 2025-09-07T07:46:33.5706091Z * [new branch] gh/dharakk/1/base -> origin/gh/dharakk/1/base 2025-09-07T07:46:33.5706405Z * [new branch] gh/dharakk/1/head -> origin/gh/dharakk/1/head 2025-09-07T07:46:33.5706645Z * [new branch] gh/drisspg/149/base -> origin/gh/drisspg/149/base 2025-09-07T07:46:33.5706875Z * [new branch] gh/drisspg/149/head -> origin/gh/drisspg/149/head 2025-09-07T07:46:33.5707098Z * [new branch] gh/drisspg/149/orig -> origin/gh/drisspg/149/orig 2025-09-07T07:46:33.5707334Z * [new branch] gh/drisspg/159/base -> origin/gh/drisspg/159/base 2025-09-07T07:46:33.5707552Z * [new branch] gh/drisspg/159/head -> origin/gh/drisspg/159/head 2025-09-07T07:46:33.5707786Z * [new branch] gh/drisspg/159/orig -> origin/gh/drisspg/159/orig 2025-09-07T07:46:33.5708005Z * [new branch] gh/drisspg/166/base -> origin/gh/drisspg/166/base 2025-09-07T07:46:33.5708225Z * [new branch] gh/drisspg/166/head -> origin/gh/drisspg/166/head 2025-09-07T07:46:33.5708462Z * [new branch] gh/drisspg/166/orig -> origin/gh/drisspg/166/orig 2025-09-07T07:46:33.5708681Z * [new branch] gh/drisspg/170/base -> origin/gh/drisspg/170/base 2025-09-07T07:46:33.5708921Z * [new branch] gh/drisspg/170/head -> origin/gh/drisspg/170/head 2025-09-07T07:46:33.5709139Z * [new branch] gh/drisspg/170/orig -> origin/gh/drisspg/170/orig 2025-09-07T07:46:33.5709372Z * [new branch] gh/drisspg/173/base -> origin/gh/drisspg/173/base 2025-09-07T07:46:33.5709594Z * [new branch] gh/drisspg/173/head -> origin/gh/drisspg/173/head 2025-09-07T07:46:33.5709815Z * [new branch] gh/drisspg/173/orig -> origin/gh/drisspg/173/orig 2025-09-07T07:46:33.5710046Z * [new branch] gh/drisspg/177/base -> origin/gh/drisspg/177/base 2025-09-07T07:46:33.5710267Z * [new branch] 
gh/drisspg/177/head -> origin/gh/drisspg/177/head 2025-09-07T07:46:33.5710500Z * [new branch] gh/drisspg/177/orig -> origin/gh/drisspg/177/orig 2025-09-07T07:46:33.5710805Z * [new branch] gh/drisspg/178/base -> origin/gh/drisspg/178/base 2025-09-07T07:46:33.5711030Z * [new branch] gh/drisspg/178/head -> origin/gh/drisspg/178/head 2025-09-07T07:46:33.5711264Z * [new branch] gh/drisspg/178/orig -> origin/gh/drisspg/178/orig 2025-09-07T07:46:33.5711481Z * [new branch] gh/drisspg/180/base -> origin/gh/drisspg/180/base 2025-09-07T07:46:33.5711718Z * [new branch] gh/drisspg/180/head -> origin/gh/drisspg/180/head 2025-09-07T07:46:33.5711938Z * [new branch] gh/drisspg/180/orig -> origin/gh/drisspg/180/orig 2025-09-07T07:46:33.5712175Z * [new branch] gh/drisspg/181/base -> origin/gh/drisspg/181/base 2025-09-07T07:46:33.5712398Z * [new branch] gh/drisspg/181/head -> origin/gh/drisspg/181/head 2025-09-07T07:46:33.5712624Z * [new branch] gh/drisspg/181/orig -> origin/gh/drisspg/181/orig 2025-09-07T07:46:33.5712864Z * [new branch] gh/drisspg/182/base -> origin/gh/drisspg/182/base 2025-09-07T07:46:33.5713082Z * [new branch] gh/drisspg/182/head -> origin/gh/drisspg/182/head 2025-09-07T07:46:33.5713315Z * [new branch] gh/drisspg/183/base -> origin/gh/drisspg/183/base 2025-09-07T07:46:33.5713535Z * [new branch] gh/drisspg/183/head -> origin/gh/drisspg/183/head 2025-09-07T07:46:33.5713770Z * [new branch] gh/drisspg/184/base -> origin/gh/drisspg/184/base 2025-09-07T07:46:33.5713995Z * [new branch] gh/drisspg/184/head -> origin/gh/drisspg/184/head 2025-09-07T07:46:33.5714215Z * [new branch] gh/drisspg/185/base -> origin/gh/drisspg/185/base 2025-09-07T07:46:33.5714451Z * [new branch] gh/drisspg/185/head -> origin/gh/drisspg/185/head 2025-09-07T07:46:33.5714763Z * [new branch] gh/drisspg/186/base -> origin/gh/drisspg/186/base 2025-09-07T07:46:33.5714998Z * [new branch] gh/drisspg/186/head -> origin/gh/drisspg/186/head 2025-09-07T07:46:33.5715220Z * [new branch] gh/drisspg/186/orig -> 
origin/gh/drisspg/186/orig 2025-09-07T07:46:33.5715441Z * [new branch] gh/drisspg/187/base -> origin/gh/drisspg/187/base 2025-09-07T07:46:33.5715667Z * [new branch] gh/drisspg/187/head -> origin/gh/drisspg/187/head 2025-09-07T07:46:33.5715884Z * [new branch] gh/drisspg/187/orig -> origin/gh/drisspg/187/orig 2025-09-07T07:46:33.5716116Z * [new branch] gh/drisspg/188/base -> origin/gh/drisspg/188/base 2025-09-07T07:46:33.5716334Z * [new branch] gh/drisspg/188/head -> origin/gh/drisspg/188/head 2025-09-07T07:46:33.5716569Z * [new branch] gh/drisspg/188/orig -> origin/gh/drisspg/188/orig 2025-09-07T07:46:33.5716787Z * [new branch] gh/drisspg/189/base -> origin/gh/drisspg/189/base 2025-09-07T07:46:33.5717012Z * [new branch] gh/drisspg/189/head -> origin/gh/drisspg/189/head 2025-09-07T07:46:33.5717242Z * [new branch] gh/drisspg/189/orig -> origin/gh/drisspg/189/orig 2025-09-07T07:46:33.5717463Z * [new branch] gh/drisspg/190/base -> origin/gh/drisspg/190/base 2025-09-07T07:46:33.5717693Z * [new branch] gh/drisspg/190/head -> origin/gh/drisspg/190/head 2025-09-07T07:46:33.5717913Z * [new branch] gh/drisspg/190/orig -> origin/gh/drisspg/190/orig 2025-09-07T07:46:33.5718131Z * [new branch] gh/drisspg/191/base -> origin/gh/drisspg/191/base 2025-09-07T07:46:33.5718361Z * [new branch] gh/drisspg/191/head -> origin/gh/drisspg/191/head 2025-09-07T07:46:33.5718584Z * [new branch] gh/drisspg/191/orig -> origin/gh/drisspg/191/orig 2025-09-07T07:46:33.5718818Z * [new branch] gh/drisspg/192/base -> origin/gh/drisspg/192/base 2025-09-07T07:46:33.5719141Z * [new branch] gh/drisspg/192/head -> origin/gh/drisspg/192/head 2025-09-07T07:46:33.5719377Z * [new branch] gh/drisspg/192/orig -> origin/gh/drisspg/192/orig 2025-09-07T07:46:33.5719600Z * [new branch] gh/drisspg/193/base -> origin/gh/drisspg/193/base 2025-09-07T07:46:33.5719818Z * [new branch] gh/drisspg/193/head -> origin/gh/drisspg/193/head 2025-09-07T07:46:33.5720049Z * [new branch] gh/drisspg/193/orig -> 
origin/gh/drisspg/193/orig 2025-09-07T07:46:33.5720269Z * [new branch] gh/drisspg/194/base -> origin/gh/drisspg/194/base 2025-09-07T07:46:33.5720499Z * [new branch] gh/drisspg/194/head -> origin/gh/drisspg/194/head 2025-09-07T07:46:33.5720721Z * [new branch] gh/drisspg/194/orig -> origin/gh/drisspg/194/orig 2025-09-07T07:46:33.5720951Z * [new branch] gh/drisspg/195/base -> origin/gh/drisspg/195/base 2025-09-07T07:46:33.5721174Z * [new branch] gh/drisspg/195/head -> origin/gh/drisspg/195/head 2025-09-07T07:46:33.5721393Z * [new branch] gh/drisspg/195/orig -> origin/gh/drisspg/195/orig 2025-09-07T07:46:33.5721626Z * [new branch] gh/drisspg/196/base -> origin/gh/drisspg/196/base 2025-09-07T07:46:33.5721844Z * [new branch] gh/drisspg/196/head -> origin/gh/drisspg/196/head 2025-09-07T07:46:33.5722078Z * [new branch] gh/drisspg/196/orig -> origin/gh/drisspg/196/orig 2025-09-07T07:46:33.5722297Z * [new branch] gh/drisspg/197/base -> origin/gh/drisspg/197/base 2025-09-07T07:46:33.5722515Z * [new branch] gh/drisspg/197/head -> origin/gh/drisspg/197/head 2025-09-07T07:46:33.5723011Z * [new branch] gh/drisspg/197/orig -> origin/gh/drisspg/197/orig 2025-09-07T07:46:33.5723236Z * [new branch] gh/drisspg/198/base -> origin/gh/drisspg/198/base 2025-09-07T07:46:33.5723468Z * [new branch] gh/drisspg/198/head -> origin/gh/drisspg/198/head 2025-09-07T07:46:33.5723687Z * [new branch] gh/drisspg/198/orig -> origin/gh/drisspg/198/orig 2025-09-07T07:46:33.5723922Z * [new branch] gh/drisspg/199/base -> origin/gh/drisspg/199/base 2025-09-07T07:46:33.5724142Z * [new branch] gh/drisspg/199/head -> origin/gh/drisspg/199/head 2025-09-07T07:46:33.5724360Z * [new branch] gh/drisspg/199/orig -> origin/gh/drisspg/199/orig 2025-09-07T07:46:33.5724594Z * [new branch] gh/dsjohns2/1/base -> origin/gh/dsjohns2/1/base 2025-09-07T07:46:33.5724811Z * [new branch] gh/dsjohns2/1/head -> origin/gh/dsjohns2/1/head 2025-09-07T07:46:33.5725057Z * [new branch] gh/eellison/784/base -> origin/gh/eellison/784/base 
2025-09-07T07:46:33.5725287Z * [new branch] gh/eellison/784/head -> origin/gh/eellison/784/head 2025-09-07T07:46:33.5725523Z * [new branch] gh/eellison/784/orig -> origin/gh/eellison/784/orig 2025-09-07T07:46:33.5725749Z * [new branch] gh/eellison/785/base -> origin/gh/eellison/785/base 2025-09-07T07:46:33.5725974Z * [new branch] gh/eellison/785/head -> origin/gh/eellison/785/head 2025-09-07T07:46:33.5726211Z * [new branch] gh/eellison/785/orig -> origin/gh/eellison/785/orig 2025-09-07T07:46:33.5726436Z * [new branch] gh/eellison/789/base -> origin/gh/eellison/789/base 2025-09-07T07:46:33.5726672Z * [new branch] gh/eellison/789/head -> origin/gh/eellison/789/head 2025-09-07T07:46:33.5726896Z * [new branch] gh/eellison/789/orig -> origin/gh/eellison/789/orig 2025-09-07T07:46:33.5727125Z * [new branch] gh/eellison/800/base -> origin/gh/eellison/800/base 2025-09-07T07:46:33.5727506Z * [new branch] gh/eellison/800/head -> origin/gh/eellison/800/head 2025-09-07T07:46:33.5727743Z * [new branch] gh/eellison/800/orig -> origin/gh/eellison/800/orig 2025-09-07T07:46:33.5727981Z * [new branch] gh/eellison/801/base -> origin/gh/eellison/801/base 2025-09-07T07:46:33.5728206Z * [new branch] gh/eellison/801/head -> origin/gh/eellison/801/head 2025-09-07T07:46:33.5728445Z * [new branch] gh/eellison/801/orig -> origin/gh/eellison/801/orig 2025-09-07T07:46:33.5728670Z * [new branch] gh/eellison/802/base -> origin/gh/eellison/802/base 2025-09-07T07:46:33.5728894Z * [new branch] gh/eellison/802/head -> origin/gh/eellison/802/head 2025-09-07T07:46:33.5729162Z * [new branch] gh/eellison/802/orig -> origin/gh/eellison/802/orig 2025-09-07T07:46:33.5729387Z * [new branch] gh/eellison/805/base -> origin/gh/eellison/805/base 2025-09-07T07:46:33.5729630Z * [new branch] gh/eellison/805/head -> origin/gh/eellison/805/head 2025-09-07T07:46:33.5729855Z * [new branch] gh/eellison/805/orig -> origin/gh/eellison/805/orig 2025-09-07T07:46:33.5730078Z * [new branch] gh/eellison/808/base -> 
origin/gh/eellison/808/base 2025-09-07T07:46:33.5730317Z * [new branch] gh/eellison/808/head -> origin/gh/eellison/808/head 2025-09-07T07:46:33.5730542Z * [new branch] gh/eellison/808/orig -> origin/gh/eellison/808/orig 2025-09-07T07:46:33.5730781Z * [new branch] gh/eellison/809/base -> origin/gh/eellison/809/base 2025-09-07T07:46:33.5731005Z * [new branch] gh/eellison/809/head -> origin/gh/eellison/809/head 2025-09-07T07:46:33.5731448Z * [new branch] gh/eellison/809/orig -> origin/gh/eellison/809/orig 2025-09-07T07:46:33.5731676Z * [new branch] gh/eellison/813/base -> origin/gh/eellison/813/base 2025-09-07T07:46:33.5731899Z * [new branch] gh/eellison/813/head -> origin/gh/eellison/813/head 2025-09-07T07:46:33.5732137Z * [new branch] gh/eellison/813/orig -> origin/gh/eellison/813/orig 2025-09-07T07:46:33.5732364Z * [new branch] gh/eellison/814/base -> origin/gh/eellison/814/base 2025-09-07T07:46:33.5732603Z * [new branch] gh/eellison/814/head -> origin/gh/eellison/814/head 2025-09-07T07:46:33.5732826Z * [new branch] gh/eellison/814/orig -> origin/gh/eellison/814/orig 2025-09-07T07:46:33.5733064Z * [new branch] gh/eellison/815/base -> origin/gh/eellison/815/base 2025-09-07T07:46:33.5733287Z * [new branch] gh/eellison/815/head -> origin/gh/eellison/815/head 2025-09-07T07:46:33.5733514Z * [new branch] gh/eellison/815/orig -> origin/gh/eellison/815/orig 2025-09-07T07:46:33.5733753Z * [new branch] gh/eellison/816/base -> origin/gh/eellison/816/base 2025-09-07T07:46:33.5733979Z * [new branch] gh/eellison/816/head -> origin/gh/eellison/816/head 2025-09-07T07:46:33.5734213Z * [new branch] gh/eellison/816/orig -> origin/gh/eellison/816/orig 2025-09-07T07:46:33.5734442Z * [new branch] gh/eellison/817/base -> origin/gh/eellison/817/base 2025-09-07T07:46:33.5734664Z * [new branch] gh/eellison/817/head -> origin/gh/eellison/817/head 2025-09-07T07:46:33.5734898Z * [new branch] gh/eellison/817/orig -> origin/gh/eellison/817/orig 2025-09-07T07:46:33.5735119Z * [new branch] 
gh/eellison/818/base -> origin/gh/eellison/818/base 2025-09-07T07:46:33.5735356Z * [new branch] gh/eellison/818/head -> origin/gh/eellison/818/head 2025-09-07T07:46:33.5735583Z * [new branch] gh/eellison/818/orig -> origin/gh/eellison/818/orig 2025-09-07T07:46:33.5735913Z * [new branch] gh/eellison/819/base -> origin/gh/eellison/819/base 2025-09-07T07:46:33.5736141Z * [new branch] gh/eellison/819/head -> origin/gh/eellison/819/head 2025-09-07T07:46:33.5736365Z * [new branch] gh/eellison/819/orig -> origin/gh/eellison/819/orig 2025-09-07T07:46:33.5736600Z * [new branch] gh/eellison/820/base -> origin/gh/eellison/820/base 2025-09-07T07:46:33.5736821Z * [new branch] gh/eellison/820/head -> origin/gh/eellison/820/head 2025-09-07T07:46:33.5737057Z * [new branch] gh/eellison/820/orig -> origin/gh/eellison/820/orig 2025-09-07T07:46:33.5737280Z * [new branch] gh/eellison/821/base -> origin/gh/eellison/821/base 2025-09-07T07:46:33.5737588Z * [new branch] gh/eellison/821/head -> origin/gh/eellison/821/head 2025-09-07T07:46:33.5737827Z * [new branch] gh/eellison/821/orig -> origin/gh/eellison/821/orig 2025-09-07T07:46:33.5738059Z * [new branch] gh/eellison/822/base -> origin/gh/eellison/822/base 2025-09-07T07:46:33.5738295Z * [new branch] gh/eellison/822/head -> origin/gh/eellison/822/head 2025-09-07T07:46:33.5738521Z * [new branch] gh/eellison/822/orig -> origin/gh/eellison/822/orig 2025-09-07T07:46:33.5738761Z * [new branch] gh/eellison/823/base -> origin/gh/eellison/823/base 2025-09-07T07:46:33.5738985Z * [new branch] gh/eellison/823/head -> origin/gh/eellison/823/head 2025-09-07T07:46:33.5739209Z * [new branch] gh/eellison/823/orig -> origin/gh/eellison/823/orig 2025-09-07T07:46:33.5739429Z * [new branch] gh/etaf/132/base -> origin/gh/etaf/132/base 2025-09-07T07:46:33.5739737Z * [new branch] gh/etaf/132/head -> origin/gh/etaf/132/head 2025-09-07T07:46:33.5739957Z * [new branch] gh/etaf/132/orig -> origin/gh/etaf/132/orig 2025-09-07T07:46:33.5740166Z * [new branch] 
gh/etaf/138/base -> origin/gh/etaf/138/base 2025-09-07T07:46:33.5740382Z * [new branch] gh/etaf/138/head -> origin/gh/etaf/138/head 2025-09-07T07:46:33.5740586Z * [new branch] gh/etaf/138/orig -> origin/gh/etaf/138/orig 2025-09-07T07:46:33.5740790Z * [new branch] gh/etaf/140/base -> origin/gh/etaf/140/base 2025-09-07T07:46:33.5741005Z * [new branch] gh/etaf/140/head -> origin/gh/etaf/140/head 2025-09-07T07:46:33.5741209Z * [new branch] gh/etaf/140/orig -> origin/gh/etaf/140/orig 2025-09-07T07:46:33.5741426Z * [new branch] gh/etaf/143/base -> origin/gh/etaf/143/base 2025-09-07T07:46:33.5741633Z * [new branch] gh/etaf/143/head -> origin/gh/etaf/143/head 2025-09-07T07:46:33.5741835Z * [new branch] gh/etaf/143/orig -> origin/gh/etaf/143/orig 2025-09-07T07:46:33.5742059Z * [new branch] gh/etaf/147/base -> origin/gh/etaf/147/base 2025-09-07T07:46:33.5742265Z * [new branch] gh/etaf/147/head -> origin/gh/etaf/147/head 2025-09-07T07:46:33.5742483Z * [new branch] gh/etaf/151/base -> origin/gh/etaf/151/base 2025-09-07T07:46:33.5742686Z * [new branch] gh/etaf/151/head -> origin/gh/etaf/151/head 2025-09-07T07:46:33.5742902Z * [new branch] gh/etaf/151/orig -> origin/gh/etaf/151/orig 2025-09-07T07:46:33.5743105Z * [new branch] gh/etaf/152/base -> origin/gh/etaf/152/base 2025-09-07T07:46:33.5743308Z * [new branch] gh/etaf/152/head -> origin/gh/etaf/152/head 2025-09-07T07:46:33.5743526Z * [new branch] gh/etaf/152/orig -> origin/gh/etaf/152/orig 2025-09-07T07:46:33.5743729Z * [new branch] gh/etaf/153/base -> origin/gh/etaf/153/base 2025-09-07T07:46:33.5744042Z * [new branch] gh/etaf/153/head -> origin/gh/etaf/153/head 2025-09-07T07:46:33.5744249Z * [new branch] gh/etaf/153/orig -> origin/gh/etaf/153/orig 2025-09-07T07:46:33.5744452Z * [new branch] gh/etaf/154/base -> origin/gh/etaf/154/base 2025-09-07T07:46:33.5744667Z * [new branch] gh/etaf/154/head -> origin/gh/etaf/154/head 2025-09-07T07:46:33.5744871Z * [new branch] gh/etaf/154/orig -> origin/gh/etaf/154/orig 
2025-09-07T07:46:33.5745086Z * [new branch] gh/etaf/155/base -> origin/gh/etaf/155/base 2025-09-07T07:46:33.5745289Z * [new branch] gh/etaf/155/head -> origin/gh/etaf/155/head 2025-09-07T07:46:33.5745508Z * [new branch] gh/etaf/155/orig -> origin/gh/etaf/155/orig 2025-09-07T07:46:33.5745713Z * [new branch] gh/etaf/156/base -> origin/gh/etaf/156/base 2025-09-07T07:46:33.5745919Z * [new branch] gh/etaf/156/head -> origin/gh/etaf/156/head 2025-09-07T07:46:33.5746140Z * [new branch] gh/etaf/156/orig -> origin/gh/etaf/156/orig 2025-09-07T07:46:33.5746341Z * [new branch] gh/etaf/157/base -> origin/gh/etaf/157/base 2025-09-07T07:46:33.5746557Z * [new branch] gh/etaf/157/head -> origin/gh/etaf/157/head 2025-09-07T07:46:33.5746760Z * [new branch] gh/etaf/157/orig -> origin/gh/etaf/157/orig 2025-09-07T07:46:33.5746962Z * [new branch] gh/etaf/158/base -> origin/gh/etaf/158/base 2025-09-07T07:46:33.5747173Z * [new branch] gh/etaf/158/head -> origin/gh/etaf/158/head 2025-09-07T07:46:33.5747472Z * [new branch] gh/etaf/158/orig -> origin/gh/etaf/158/orig 2025-09-07T07:46:33.5747686Z * [new branch] gh/etaf/159/base -> origin/gh/etaf/159/base 2025-09-07T07:46:33.5747887Z * [new branch] gh/etaf/159/head -> origin/gh/etaf/159/head 2025-09-07T07:46:33.5748106Z * [new branch] gh/etaf/159/orig -> origin/gh/etaf/159/orig 2025-09-07T07:46:33.5748308Z * [new branch] gh/etaf/160/base -> origin/gh/etaf/160/base 2025-09-07T07:46:33.5748510Z * [new branch] gh/etaf/160/head -> origin/gh/etaf/160/head 2025-09-07T07:46:33.5748723Z * [new branch] gh/etaf/160/orig -> origin/gh/etaf/160/orig 2025-09-07T07:46:33.5748928Z * [new branch] gh/etaf/161/base -> origin/gh/etaf/161/base 2025-09-07T07:46:33.5749140Z * [new branch] gh/etaf/161/head -> origin/gh/etaf/161/head 2025-09-07T07:46:33.5749349Z * [new branch] gh/etaf/161/orig -> origin/gh/etaf/161/orig 2025-09-07T07:46:33.5749549Z * [new branch] gh/etaf/162/base -> origin/gh/etaf/162/base 2025-09-07T07:46:33.5749764Z * [new branch] gh/etaf/162/head -> 
origin/gh/etaf/162/head 2025-09-07T07:46:33.5749970Z * [new branch] gh/etaf/162/orig -> origin/gh/etaf/162/orig 2025-09-07T07:46:33.5750190Z * [new branch] gh/etaf/163/base -> origin/gh/etaf/163/base 2025-09-07T07:46:33.5750392Z * [new branch] gh/etaf/163/head -> origin/gh/etaf/163/head 2025-09-07T07:46:33.5750605Z * [new branch] gh/etaf/163/orig -> origin/gh/etaf/163/orig 2025-09-07T07:46:33.5750808Z * [new branch] gh/etaf/164/base -> origin/gh/etaf/164/base 2025-09-07T07:46:33.5751012Z * [new branch] gh/etaf/164/head -> origin/gh/etaf/164/head 2025-09-07T07:46:33.5751229Z * [new branch] gh/etaf/164/orig -> origin/gh/etaf/164/orig 2025-09-07T07:46:33.5751430Z * [new branch] gh/etaf/165/base -> origin/gh/etaf/165/base 2025-09-07T07:46:33.5751739Z * [new branch] gh/etaf/165/orig -> origin/gh/etaf/165/orig 2025-09-07T07:46:33.5751944Z * [new branch] gh/etaf/166/base -> origin/gh/etaf/166/base 2025-09-07T07:46:33.5752150Z * [new branch] gh/etaf/166/head -> origin/gh/etaf/166/head 2025-09-07T07:46:33.5752369Z * [new branch] gh/etaf/166/orig -> origin/gh/etaf/166/orig 2025-09-07T07:46:33.5752569Z * [new branch] gh/etaf/167/base -> origin/gh/etaf/167/base 2025-09-07T07:46:33.5752784Z * [new branch] gh/etaf/167/head -> origin/gh/etaf/167/head 2025-09-07T07:46:33.5752984Z * [new branch] gh/etaf/167/orig -> origin/gh/etaf/167/orig 2025-09-07T07:46:33.5753195Z * [new branch] gh/etaf/168/base -> origin/gh/etaf/168/base 2025-09-07T07:46:33.5753399Z * [new branch] gh/etaf/168/head -> origin/gh/etaf/168/head 2025-09-07T07:46:33.5753603Z * [new branch] gh/etaf/168/orig -> origin/gh/etaf/168/orig 2025-09-07T07:46:33.5753815Z * [new branch] gh/etaf/169/base -> origin/gh/etaf/169/base 2025-09-07T07:46:33.5754016Z * [new branch] gh/etaf/169/head -> origin/gh/etaf/169/head 2025-09-07T07:46:33.5754231Z * [new branch] gh/etaf/169/orig -> origin/gh/etaf/169/orig 2025-09-07T07:46:33.5754481Z * [new branch] gh/exclamaforte/1/base -> origin/gh/exclamaforte/1/base 
2025-09-07T07:46:33.5754734Z * [new branch] gh/exclamaforte/1/head -> origin/gh/exclamaforte/1/head 2025-09-07T07:46:33.5754977Z * [new branch] gh/exclamaforte/2/base -> origin/gh/exclamaforte/2/base 2025-09-07T07:46:33.5755334Z * [new branch] gh/exclamaforte/2/head -> origin/gh/exclamaforte/2/head 2025-09-07T07:46:33.5755584Z * [new branch] gh/exclamaforte/3/base -> origin/gh/exclamaforte/3/base 2025-09-07T07:46:33.5755827Z * [new branch] gh/exclamaforte/3/head -> origin/gh/exclamaforte/3/head 2025-09-07T07:46:33.5756073Z * [new branch] gh/exclamaforte/4/base -> origin/gh/exclamaforte/4/base 2025-09-07T07:46:33.5756312Z * [new branch] gh/exclamaforte/4/head -> origin/gh/exclamaforte/4/head 2025-09-07T07:46:33.5756531Z * [new branch] gh/ezyang/2374/base -> origin/gh/ezyang/2374/base 2025-09-07T07:46:33.5756759Z * [new branch] gh/ezyang/2374/head -> origin/gh/ezyang/2374/head 2025-09-07T07:46:33.5756977Z * [new branch] gh/ezyang/2374/orig -> origin/gh/ezyang/2374/orig 2025-09-07T07:46:33.5757206Z * [new branch] gh/ezyang/2973/base -> origin/gh/ezyang/2973/base 2025-09-07T07:46:33.5757424Z * [new branch] gh/ezyang/2973/head -> origin/gh/ezyang/2973/head 2025-09-07T07:46:33.5757653Z * [new branch] gh/ezyang/2973/orig -> origin/gh/ezyang/2973/orig 2025-09-07T07:46:33.5757869Z * [new branch] gh/ezyang/2974/base -> origin/gh/ezyang/2974/base 2025-09-07T07:46:33.5758086Z * [new branch] gh/ezyang/2974/head -> origin/gh/ezyang/2974/head 2025-09-07T07:46:33.5758312Z * [new branch] gh/ezyang/2974/orig -> origin/gh/ezyang/2974/orig 2025-09-07T07:46:33.5758523Z * [new branch] gh/ezyang/3074/base -> origin/gh/ezyang/3074/base 2025-09-07T07:46:33.5758748Z * [new branch] gh/ezyang/3074/head -> origin/gh/ezyang/3074/head 2025-09-07T07:46:33.5758961Z * [new branch] gh/ezyang/3074/orig -> origin/gh/ezyang/3074/orig 2025-09-07T07:46:33.5759175Z * [new branch] gh/ezyang/3088/base -> origin/gh/ezyang/3088/base 2025-09-07T07:46:33.5759404Z * [new branch] gh/ezyang/3088/head -> 
origin/gh/ezyang/3088/head 2025-09-07T07:46:33.5759726Z * [new branch] gh/ezyang/3088/orig -> origin/gh/ezyang/3088/orig 2025-09-07T07:46:33.5759958Z * [new branch] gh/ezyang/3092/base -> origin/gh/ezyang/3092/base 2025-09-07T07:46:33.5760172Z * [new branch] gh/ezyang/3092/head -> origin/gh/ezyang/3092/head 2025-09-07T07:46:33.5760396Z * [new branch] gh/ezyang/3092/orig -> origin/gh/ezyang/3092/orig 2025-09-07T07:46:33.5760614Z * [new branch] gh/ezyang/3103/base -> origin/gh/ezyang/3103/base 2025-09-07T07:46:33.5760829Z * [new branch] gh/ezyang/3103/head -> origin/gh/ezyang/3103/head 2025-09-07T07:46:33.5761058Z * [new branch] gh/ezyang/3103/orig -> origin/gh/ezyang/3103/orig 2025-09-07T07:46:33.5761269Z * [new branch] gh/ezyang/3105/base -> origin/gh/ezyang/3105/base 2025-09-07T07:46:33.5761500Z * [new branch] gh/ezyang/3105/head -> origin/gh/ezyang/3105/head 2025-09-07T07:46:33.5761719Z * [new branch] gh/ezyang/3105/orig -> origin/gh/ezyang/3105/orig 2025-09-07T07:46:33.5761947Z * [new branch] gh/ezyang/3114/base -> origin/gh/ezyang/3114/base 2025-09-07T07:46:33.5762164Z * [new branch] gh/ezyang/3114/head -> origin/gh/ezyang/3114/head 2025-09-07T07:46:33.5762376Z * [new branch] gh/ezyang/3114/orig -> origin/gh/ezyang/3114/orig 2025-09-07T07:46:33.5762599Z * [new branch] gh/ezyang/3116/base -> origin/gh/ezyang/3116/base 2025-09-07T07:46:33.5762815Z * [new branch] gh/ezyang/3116/head -> origin/gh/ezyang/3116/head 2025-09-07T07:46:33.5763176Z * [new branch] gh/ezyang/3116/orig -> origin/gh/ezyang/3116/orig 2025-09-07T07:46:33.5763394Z * [new branch] gh/ezyang/3120/base -> origin/gh/ezyang/3120/base 2025-09-07T07:46:33.5763737Z * [new branch] gh/ezyang/3120/head -> origin/gh/ezyang/3120/head 2025-09-07T07:46:33.5763965Z * [new branch] gh/ezyang/3120/orig -> origin/gh/ezyang/3120/orig 2025-09-07T07:46:33.5764180Z * [new branch] gh/ezyang/3122/base -> origin/gh/ezyang/3122/base 2025-09-07T07:46:33.5764408Z * [new branch] gh/ezyang/3122/head -> 
origin/gh/ezyang/3122/head 2025-09-07T07:46:33.5764624Z * [new branch] gh/ezyang/3122/orig -> origin/gh/ezyang/3122/orig 2025-09-07T07:46:33.5764852Z * [new branch] gh/ezyang/3123/base -> origin/gh/ezyang/3123/base 2025-09-07T07:46:33.5765067Z * [new branch] gh/ezyang/3123/head -> origin/gh/ezyang/3123/head 2025-09-07T07:46:33.5765283Z * [new branch] gh/ezyang/3123/orig -> origin/gh/ezyang/3123/orig 2025-09-07T07:46:33.5765516Z * [new branch] gh/ezyang/3125/base -> origin/gh/ezyang/3125/base 2025-09-07T07:46:33.5765733Z * [new branch] gh/ezyang/3125/head -> origin/gh/ezyang/3125/head 2025-09-07T07:46:33.5765962Z * [new branch] gh/ezyang/3125/orig -> origin/gh/ezyang/3125/orig 2025-09-07T07:46:33.5766178Z * [new branch] gh/ezyang/3126/base -> origin/gh/ezyang/3126/base 2025-09-07T07:46:33.5766394Z * [new branch] gh/ezyang/3126/head -> origin/gh/ezyang/3126/head 2025-09-07T07:46:33.5766620Z * [new branch] gh/ezyang/3126/orig -> origin/gh/ezyang/3126/orig 2025-09-07T07:46:33.5766834Z * [new branch] gh/ezyang/3127/base -> origin/gh/ezyang/3127/base 2025-09-07T07:46:33.5767062Z * [new branch] gh/ezyang/3127/head -> origin/gh/ezyang/3127/head 2025-09-07T07:46:33.5767276Z * [new branch] gh/ezyang/3127/orig -> origin/gh/ezyang/3127/orig 2025-09-07T07:46:33.5767507Z * [new branch] gh/ezyang/3128/base -> origin/gh/ezyang/3128/base 2025-09-07T07:46:33.5767724Z * [new branch] gh/ezyang/3128/head -> origin/gh/ezyang/3128/head 2025-09-07T07:46:33.5768057Z * [new branch] gh/ezyang/3128/orig -> origin/gh/ezyang/3128/orig 2025-09-07T07:46:33.5768288Z * [new branch] gh/ezyang/3129/base -> origin/gh/ezyang/3129/base 2025-09-07T07:46:33.5768501Z * [new branch] gh/ezyang/3129/head -> origin/gh/ezyang/3129/head 2025-09-07T07:46:33.5768727Z * [new branch] gh/ezyang/3129/orig -> origin/gh/ezyang/3129/orig 2025-09-07T07:46:33.5768943Z * [new branch] gh/ezyang/3130/base -> origin/gh/ezyang/3130/base 2025-09-07T07:46:33.5769165Z * [new branch] gh/ezyang/3130/head -> 
origin/gh/ezyang/3130/head 2025-09-07T07:46:33.5769379Z * [new branch] gh/ezyang/3130/orig -> origin/gh/ezyang/3130/orig 2025-09-07T07:46:33.5769598Z * [new branch] gh/ezyang/3131/base -> origin/gh/ezyang/3131/base 2025-09-07T07:46:33.5769828Z * [new branch] gh/ezyang/3131/head -> origin/gh/ezyang/3131/head 2025-09-07T07:46:33.5770046Z * [new branch] gh/ezyang/3131/orig -> origin/gh/ezyang/3131/orig 2025-09-07T07:46:33.5770271Z * [new branch] gh/ezyang/3132/base -> origin/gh/ezyang/3132/base 2025-09-07T07:46:33.5770485Z * [new branch] gh/ezyang/3132/head -> origin/gh/ezyang/3132/head 2025-09-07T07:46:33.5770700Z * [new branch] gh/ezyang/3132/orig -> origin/gh/ezyang/3132/orig 2025-09-07T07:46:33.5770925Z * [new branch] gh/ezyang/3133/base -> origin/gh/ezyang/3133/base 2025-09-07T07:46:33.5771139Z * [new branch] gh/ezyang/3133/head -> origin/gh/ezyang/3133/head 2025-09-07T07:46:33.5771366Z * [new branch] gh/ezyang/3133/orig -> origin/gh/ezyang/3133/orig 2025-09-07T07:46:33.5771674Z * [new branch] gh/ezyang/3134/base -> origin/gh/ezyang/3134/base 2025-09-07T07:46:33.5771902Z * [new branch] gh/ezyang/3134/head -> origin/gh/ezyang/3134/head 2025-09-07T07:46:33.5772120Z * [new branch] gh/ezyang/3134/orig -> origin/gh/ezyang/3134/orig 2025-09-07T07:46:33.5772333Z * [new branch] gh/ezyang/3135/base -> origin/gh/ezyang/3135/base 2025-09-07T07:46:33.5772559Z * [new branch] gh/ezyang/3135/head -> origin/gh/ezyang/3135/head 2025-09-07T07:46:33.5772775Z * [new branch] gh/ezyang/3135/orig -> origin/gh/ezyang/3135/orig 2025-09-07T07:46:33.5773003Z * [new branch] gh/ezyang/3136/base -> origin/gh/ezyang/3136/base 2025-09-07T07:46:33.5773218Z * [new branch] gh/ezyang/3136/head -> origin/gh/ezyang/3136/head 2025-09-07T07:46:33.5773430Z * [new branch] gh/ezyang/3136/orig -> origin/gh/ezyang/3136/orig 2025-09-07T07:46:33.5773663Z * [new branch] gh/ezyang/3137/base -> origin/gh/ezyang/3137/base 2025-09-07T07:46:33.5773879Z * [new branch] gh/ezyang/3137/head -> 
origin/gh/ezyang/3137/head 2025-09-07T07:46:33.5774105Z * [new branch] gh/ezyang/3137/orig -> origin/gh/ezyang/3137/orig 2025-09-07T07:46:33.5774319Z * [new branch] gh/ezyang/3138/base -> origin/gh/ezyang/3138/base 2025-09-07T07:46:33.5774545Z * [new branch] gh/ezyang/3138/head -> origin/gh/ezyang/3138/head 2025-09-07T07:46:33.5774761Z * [new branch] gh/ezyang/3138/orig -> origin/gh/ezyang/3138/orig 2025-09-07T07:46:33.5774970Z * [new branch] gh/ezyang/3139/base -> origin/gh/ezyang/3139/base 2025-09-07T07:46:33.5775195Z * [new branch] gh/ezyang/3139/head -> origin/gh/ezyang/3139/head 2025-09-07T07:46:33.5775409Z * [new branch] gh/ezyang/3139/orig -> origin/gh/ezyang/3139/orig 2025-09-07T07:46:33.5775637Z * [new branch] gh/ezyang/3140/base -> origin/gh/ezyang/3140/base 2025-09-07T07:46:33.5775939Z * [new branch] gh/ezyang/3140/head -> origin/gh/ezyang/3140/head 2025-09-07T07:46:33.5776173Z * [new branch] gh/ezyang/3140/orig -> origin/gh/ezyang/3140/orig 2025-09-07T07:46:33.5776390Z * [new branch] gh/ezyang/3141/base -> origin/gh/ezyang/3141/base 2025-09-07T07:46:33.5776605Z * [new branch] gh/ezyang/3141/head -> origin/gh/ezyang/3141/head 2025-09-07T07:46:33.5776830Z * [new branch] gh/ezyang/3141/orig -> origin/gh/ezyang/3141/orig 2025-09-07T07:46:33.5777044Z * [new branch] gh/ezyang/3142/base -> origin/gh/ezyang/3142/base 2025-09-07T07:46:33.5777278Z * [new branch] gh/ezyang/3142/head -> origin/gh/ezyang/3142/head 2025-09-07T07:46:33.5777576Z * [new branch] gh/ezyang/3142/orig -> origin/gh/ezyang/3142/orig 2025-09-07T07:46:33.5777793Z * [new branch] gh/ezyang/3143/base -> origin/gh/ezyang/3143/base 2025-09-07T07:46:33.5778029Z * [new branch] gh/ezyang/3143/head -> origin/gh/ezyang/3143/head 2025-09-07T07:46:33.5778243Z * [new branch] gh/ezyang/3143/orig -> origin/gh/ezyang/3143/orig 2025-09-07T07:46:33.5778478Z * [new branch] gh/fadara01/1/base -> origin/gh/fadara01/1/base 2025-09-07T07:46:33.5778696Z * [new branch] gh/fadara01/1/head -> origin/gh/fadara01/1/head 
2025-09-07T07:46:33.5778920Z * [new branch] gh/fadara01/1/orig -> origin/gh/fadara01/1/orig 2025-09-07T07:46:33.5779133Z * [new branch] gh/fduwjj/171/base -> origin/gh/fduwjj/171/base 2025-09-07T07:46:33.5779343Z * [new branch] gh/fduwjj/171/head -> origin/gh/fduwjj/171/head 2025-09-07T07:46:33.5779673Z * [new branch] gh/fduwjj/171/orig -> origin/gh/fduwjj/171/orig 2025-09-07T07:46:33.5779885Z * [new branch] gh/fduwjj/175/base -> origin/gh/fduwjj/175/base 2025-09-07T07:46:33.5780113Z * [new branch] gh/fduwjj/175/head -> origin/gh/fduwjj/175/head 2025-09-07T07:46:33.5780325Z * [new branch] gh/fduwjj/175/orig -> origin/gh/fduwjj/175/orig 2025-09-07T07:46:33.5780549Z * [new branch] gh/fduwjj/176/base -> origin/gh/fduwjj/176/base 2025-09-07T07:46:33.5780760Z * [new branch] gh/fduwjj/176/head -> origin/gh/fduwjj/176/head 2025-09-07T07:46:33.5780970Z * [new branch] gh/fduwjj/176/orig -> origin/gh/fduwjj/176/orig 2025-09-07T07:46:33.5781198Z * [new branch] gh/fduwjj/177/base -> origin/gh/fduwjj/177/base 2025-09-07T07:46:33.5781410Z * [new branch] gh/fduwjj/177/head -> origin/gh/fduwjj/177/head 2025-09-07T07:46:33.5781638Z * [new branch] gh/fduwjj/177/orig -> origin/gh/fduwjj/177/orig 2025-09-07T07:46:33.5781851Z * [new branch] gh/fduwjj/178/base -> origin/gh/fduwjj/178/base 2025-09-07T07:46:33.5782065Z * [new branch] gh/fduwjj/178/head -> origin/gh/fduwjj/178/head 2025-09-07T07:46:33.5782284Z * [new branch] gh/fduwjj/178/orig -> origin/gh/fduwjj/178/orig 2025-09-07T07:46:33.5782494Z * [new branch] gh/fduwjj/179/base -> origin/gh/fduwjj/179/base 2025-09-07T07:46:33.5782719Z * [new branch] gh/fduwjj/179/head -> origin/gh/fduwjj/179/head 2025-09-07T07:46:33.5782932Z * [new branch] gh/fduwjj/179/orig -> origin/gh/fduwjj/179/orig 2025-09-07T07:46:33.5783159Z * [new branch] gh/fduwjj/180/base -> origin/gh/fduwjj/180/base 2025-09-07T07:46:33.5783372Z * [new branch] gh/fduwjj/180/head -> origin/gh/fduwjj/180/head 2025-09-07T07:46:33.5783588Z * [new branch] gh/fduwjj/180/orig -> 
origin/gh/fduwjj/180/orig 2025-09-07T07:46:33.5783811Z * [new branch] gh/fduwjj/181/base -> origin/gh/fduwjj/181/base 2025-09-07T07:46:33.5784122Z * [new branch] gh/fduwjj/181/head -> origin/gh/fduwjj/181/head 2025-09-07T07:46:33.5784351Z * [new branch] gh/fduwjj/181/orig -> origin/gh/fduwjj/181/orig 2025-09-07T07:46:33.5784561Z * [new branch] gh/fduwjj/182/base -> origin/gh/fduwjj/182/base 2025-09-07T07:46:33.5784777Z * [new branch] gh/fduwjj/182/head -> origin/gh/fduwjj/182/head 2025-09-07T07:46:33.5785000Z * [new branch] gh/fduwjj/182/orig -> origin/gh/fduwjj/182/orig 2025-09-07T07:46:33.5785212Z * [new branch] gh/fduwjj/183/base -> origin/gh/fduwjj/183/base 2025-09-07T07:46:33.5785433Z * [new branch] gh/fduwjj/183/head -> origin/gh/fduwjj/183/head 2025-09-07T07:46:33.5785650Z * [new branch] gh/fduwjj/183/orig -> origin/gh/fduwjj/183/orig 2025-09-07T07:46:33.5785877Z * [new branch] gh/fduwjj/184/base -> origin/gh/fduwjj/184/base 2025-09-07T07:46:33.5786093Z * [new branch] gh/fduwjj/184/head -> origin/gh/fduwjj/184/head 2025-09-07T07:46:33.5786305Z * [new branch] gh/fduwjj/184/orig -> origin/gh/fduwjj/184/orig 2025-09-07T07:46:33.5786535Z * [new branch] gh/fduwjj/185/base -> origin/gh/fduwjj/185/base 2025-09-07T07:46:33.5786748Z * [new branch] gh/fduwjj/185/head -> origin/gh/fduwjj/185/head 2025-09-07T07:46:33.5786972Z * [new branch] gh/fduwjj/185/orig -> origin/gh/fduwjj/185/orig 2025-09-07T07:46:33.5787184Z * [new branch] gh/fduwjj/186/base -> origin/gh/fduwjj/186/base 2025-09-07T07:46:33.5787406Z * [new branch] gh/fduwjj/186/head -> origin/gh/fduwjj/186/head 2025-09-07T07:46:33.5787728Z * [new branch] gh/fduwjj/186/orig -> origin/gh/fduwjj/186/orig 2025-09-07T07:46:33.5787945Z * [new branch] gh/fduwjj/187/base -> origin/gh/fduwjj/187/base 2025-09-07T07:46:33.5788170Z * [new branch] gh/fduwjj/187/head -> origin/gh/fduwjj/187/head 2025-09-07T07:46:33.5788380Z * [new branch] gh/fduwjj/187/orig -> origin/gh/fduwjj/187/orig 2025-09-07T07:46:33.5788604Z * [new 
branch] gh/fduwjj/188/base -> origin/gh/fduwjj/188/base 2025-09-07T07:46:33.5788818Z * [new branch] gh/fduwjj/188/head -> origin/gh/fduwjj/188/head 2025-09-07T07:46:33.5789033Z * [new branch] gh/fduwjj/188/orig -> origin/gh/fduwjj/188/orig 2025-09-07T07:46:33.5789258Z * [new branch] gh/fduwjj/189/base -> origin/gh/fduwjj/189/base 2025-09-07T07:46:33.5789471Z * [new branch] gh/fduwjj/189/head -> origin/gh/fduwjj/189/head 2025-09-07T07:46:33.5789703Z * [new branch] gh/fduwjj/189/orig -> origin/gh/fduwjj/189/orig 2025-09-07T07:46:33.5789919Z * [new branch] gh/fduwjj/190/base -> origin/gh/fduwjj/190/base 2025-09-07T07:46:33.5790144Z * [new branch] gh/fduwjj/190/head -> origin/gh/fduwjj/190/head 2025-09-07T07:46:33.5790356Z * [new branch] gh/fduwjj/190/orig -> origin/gh/fduwjj/190/orig 2025-09-07T07:46:33.5790569Z * [new branch] gh/fduwjj/191/base -> origin/gh/fduwjj/191/base 2025-09-07T07:46:33.5790794Z * [new branch] gh/fduwjj/191/head -> origin/gh/fduwjj/191/head 2025-09-07T07:46:33.5791002Z * [new branch] gh/fduwjj/191/orig -> origin/gh/fduwjj/191/orig 2025-09-07T07:46:33.5791227Z * [new branch] gh/fegin/306/base -> origin/gh/fegin/306/base 2025-09-07T07:46:33.5791436Z * [new branch] gh/fegin/306/head -> origin/gh/fegin/306/head 2025-09-07T07:46:33.5791649Z * [new branch] gh/fegin/306/orig -> origin/gh/fegin/306/orig 2025-09-07T07:46:33.5791966Z * [new branch] gh/fegin/307/base -> origin/gh/fegin/307/base 2025-09-07T07:46:33.5792179Z * [new branch] gh/fegin/307/head -> origin/gh/fegin/307/head 2025-09-07T07:46:33.5792400Z * [new branch] gh/fegin/307/orig -> origin/gh/fegin/307/orig 2025-09-07T07:46:33.5792609Z * [new branch] gh/fegin/308/base -> origin/gh/fegin/308/base 2025-09-07T07:46:33.5792832Z * [new branch] gh/fegin/308/head -> origin/gh/fegin/308/head 2025-09-07T07:46:33.5793041Z * [new branch] gh/fegin/308/orig -> origin/gh/fegin/308/orig 2025-09-07T07:46:33.5793248Z * [new branch] gh/fegin/309/base -> origin/gh/fegin/309/base 2025-09-07T07:46:33.5793470Z * 
[new branch] gh/fegin/309/head -> origin/gh/fegin/309/head 2025-09-07T07:46:33.5793680Z * [new branch] gh/fegin/309/orig -> origin/gh/fegin/309/orig 2025-09-07T07:46:33.5793906Z * [new branch] gh/fegin/310/base -> origin/gh/fegin/310/base 2025-09-07T07:46:33.5794114Z * [new branch] gh/fegin/310/head -> origin/gh/fegin/310/head 2025-09-07T07:46:33.5794323Z * [new branch] gh/fegin/310/orig -> origin/gh/fegin/310/orig 2025-09-07T07:46:33.5794544Z * [new branch] gh/fegin/311/base -> origin/gh/fegin/311/base 2025-09-07T07:46:33.5794754Z * [new branch] gh/fegin/311/head -> origin/gh/fegin/311/head 2025-09-07T07:46:33.5794974Z * [new branch] gh/fegin/311/orig -> origin/gh/fegin/311/orig 2025-09-07T07:46:33.5795183Z * [new branch] gh/fegin/312/base -> origin/gh/fegin/312/base 2025-09-07T07:46:33.5795403Z * [new branch] gh/fegin/312/head -> origin/gh/fegin/312/head 2025-09-07T07:46:33.5795706Z * [new branch] gh/fegin/312/orig -> origin/gh/fegin/312/orig 2025-09-07T07:46:33.5795919Z * [new branch] gh/fegin/313/base -> origin/gh/fegin/313/base 2025-09-07T07:46:33.5796136Z * [new branch] gh/fegin/313/head -> origin/gh/fegin/313/head 2025-09-07T07:46:33.5796340Z * [new branch] gh/fegin/313/orig -> origin/gh/fegin/313/orig 2025-09-07T07:46:33.5796566Z * [new branch] gh/fffrog/124/base -> origin/gh/fffrog/124/base 2025-09-07T07:46:33.5796780Z * [new branch] gh/fffrog/124/head -> origin/gh/fffrog/124/head 2025-09-07T07:46:33.5796991Z * [new branch] gh/fffrog/124/orig -> origin/gh/fffrog/124/orig 2025-09-07T07:46:33.5797218Z * [new branch] gh/fffrog/129/base -> origin/gh/fffrog/129/base 2025-09-07T07:46:33.5797429Z * [new branch] gh/fffrog/129/head -> origin/gh/fffrog/129/head 2025-09-07T07:46:33.5797653Z * [new branch] gh/fffrog/129/orig -> origin/gh/fffrog/129/orig 2025-09-07T07:46:33.5797867Z * [new branch] gh/fffrog/130/base -> origin/gh/fffrog/130/base 2025-09-07T07:46:33.5798087Z * [new branch] gh/fffrog/130/head -> origin/gh/fffrog/130/head 2025-09-07T07:46:33.5798298Z * 
[new branch] gh/fffrog/130/orig -> origin/gh/fffrog/130/orig 2025-09-07T07:46:33.5798510Z * [new branch] gh/fffrog/131/base -> origin/gh/fffrog/131/base 2025-09-07T07:46:33.5798734Z * [new branch] gh/fffrog/131/head -> origin/gh/fffrog/131/head 2025-09-07T07:46:33.5798944Z * [new branch] gh/fffrog/131/orig -> origin/gh/fffrog/131/orig 2025-09-07T07:46:33.5799169Z * [new branch] gh/fffrog/132/base -> origin/gh/fffrog/132/base 2025-09-07T07:46:33.5799384Z * [new branch] gh/fffrog/132/head -> origin/gh/fffrog/132/head 2025-09-07T07:46:33.5799611Z * [new branch] gh/fffrog/132/orig -> origin/gh/fffrog/132/orig 2025-09-07T07:46:33.5799914Z * [new branch] gh/fffrog/133/base -> origin/gh/fffrog/133/base 2025-09-07T07:46:33.5800133Z * [new branch] gh/fffrog/133/head -> origin/gh/fffrog/133/head 2025-09-07T07:46:33.5800353Z * [new branch] gh/fffrog/133/orig -> origin/gh/fffrog/133/orig 2025-09-07T07:46:33.5800565Z * [new branch] gh/fffrog/134/base -> origin/gh/fffrog/134/base 2025-09-07T07:46:33.5800787Z * [new branch] gh/fffrog/134/head -> origin/gh/fffrog/134/head 2025-09-07T07:46:33.5800999Z * [new branch] gh/fffrog/134/orig -> origin/gh/fffrog/134/orig 2025-09-07T07:46:33.5801213Z * [new branch] gh/fffrog/135/base -> origin/gh/fffrog/135/base 2025-09-07T07:46:33.5801440Z * [new branch] gh/fffrog/135/head -> origin/gh/fffrog/135/head 2025-09-07T07:46:33.5801651Z * [new branch] gh/fffrog/135/orig -> origin/gh/fffrog/135/orig 2025-09-07T07:46:33.5801879Z * [new branch] gh/fffrog/136/base -> origin/gh/fffrog/136/base 2025-09-07T07:46:33.5802089Z * [new branch] gh/fffrog/136/head -> origin/gh/fffrog/136/head 2025-09-07T07:46:33.5802311Z * [new branch] gh/fffrog/136/orig -> origin/gh/fffrog/136/orig 2025-09-07T07:46:33.5802522Z * [new branch] gh/fffrog/137/base -> origin/gh/fffrog/137/base 2025-09-07T07:46:33.5802736Z * [new branch] gh/fffrog/137/head -> origin/gh/fffrog/137/head 2025-09-07T07:46:33.5803303Z * [new branch] gh/fffrog/137/orig -> origin/gh/fffrog/137/orig 
2025-09-07T07:46:33.5803516Z * [new branch] gh/fffrog/138/base -> origin/gh/fffrog/138/base 2025-09-07T07:46:33.5803876Z * [new branch] gh/fffrog/138/head -> origin/gh/fffrog/138/head 2025-09-07T07:46:33.5804088Z * [new branch] gh/fffrog/138/orig -> origin/gh/fffrog/138/orig 2025-09-07T07:46:33.5804315Z * [new branch] gh/fffrog/139/base -> origin/gh/fffrog/139/base 2025-09-07T07:46:33.5804528Z * [new branch] gh/fffrog/139/head -> origin/gh/fffrog/139/head 2025-09-07T07:46:33.5804739Z * [new branch] gh/fffrog/139/orig -> origin/gh/fffrog/139/orig 2025-09-07T07:46:33.5804959Z * [new branch] gh/fffrog/140/base -> origin/gh/fffrog/140/base 2025-09-07T07:46:33.5805171Z * [new branch] gh/fffrog/140/head -> origin/gh/fffrog/140/head 2025-09-07T07:46:33.5805395Z * [new branch] gh/fffrog/140/orig -> origin/gh/fffrog/140/orig 2025-09-07T07:46:33.5805606Z * [new branch] gh/fffrog/141/base -> origin/gh/fffrog/141/base 2025-09-07T07:46:33.5805819Z * [new branch] gh/fffrog/141/head -> origin/gh/fffrog/141/head 2025-09-07T07:46:33.5806043Z * [new branch] gh/fffrog/141/orig -> origin/gh/fffrog/141/orig 2025-09-07T07:46:33.5806258Z * [new branch] gh/fffrog/142/base -> origin/gh/fffrog/142/base 2025-09-07T07:46:33.5806480Z * [new branch] gh/fffrog/142/head -> origin/gh/fffrog/142/head 2025-09-07T07:46:33.5806692Z * [new branch] gh/fffrog/142/orig -> origin/gh/fffrog/142/orig 2025-09-07T07:46:33.5806915Z * [new branch] gh/fffrog/143/base -> origin/gh/fffrog/143/base 2025-09-07T07:46:33.5807130Z * [new branch] gh/fffrog/143/head -> origin/gh/fffrog/143/head 2025-09-07T07:46:33.5807343Z * [new branch] gh/fffrog/143/orig -> origin/gh/fffrog/143/orig 2025-09-07T07:46:33.5807565Z * [new branch] gh/fffrog/144/base -> origin/gh/fffrog/144/base 2025-09-07T07:46:33.5807782Z * [new branch] gh/fffrog/144/head -> origin/gh/fffrog/144/head 2025-09-07T07:46:33.5808009Z * [new branch] gh/fffrog/144/orig -> origin/gh/fffrog/144/orig 2025-09-07T07:46:33.5808346Z * [new branch] gh/fffrog/145/base -> 
origin/gh/fffrog/145/base 2025-09-07T07:46:33.5808563Z * [new branch] gh/fffrog/145/head -> origin/gh/fffrog/145/head 2025-09-07T07:46:33.5808787Z * [new branch] gh/fffrog/145/orig -> origin/gh/fffrog/145/orig 2025-09-07T07:46:33.5808999Z * [new branch] gh/fffrog/146/base -> origin/gh/fffrog/146/base 2025-09-07T07:46:33.5809225Z * [new branch] gh/fffrog/146/head -> origin/gh/fffrog/146/head 2025-09-07T07:46:33.5809437Z * [new branch] gh/fffrog/146/orig -> origin/gh/fffrog/146/orig 2025-09-07T07:46:33.5809663Z * [new branch] gh/fffrog/147/base -> origin/gh/fffrog/147/base 2025-09-07T07:46:33.5809879Z * [new branch] gh/fffrog/147/head -> origin/gh/fffrog/147/head 2025-09-07T07:46:33.5810095Z * [new branch] gh/fffrog/147/orig -> origin/gh/fffrog/147/orig 2025-09-07T07:46:33.5810327Z * [new branch] gh/fffrog/148/base -> origin/gh/fffrog/148/base 2025-09-07T07:46:33.5810541Z * [new branch] gh/fffrog/148/head -> origin/gh/fffrog/148/head 2025-09-07T07:46:33.5810768Z * [new branch] gh/fffrog/148/orig -> origin/gh/fffrog/148/orig 2025-09-07T07:46:33.5810978Z * [new branch] gh/fffrog/149/base -> origin/gh/fffrog/149/base 2025-09-07T07:46:33.5811198Z * [new branch] gh/fffrog/149/head -> origin/gh/fffrog/149/head 2025-09-07T07:46:33.5811409Z * [new branch] gh/fffrog/149/orig -> origin/gh/fffrog/149/orig 2025-09-07T07:46:33.5811617Z * [new branch] gh/fffrog/150/base -> origin/gh/fffrog/150/base 2025-09-07T07:46:33.5811944Z * [new branch] gh/fffrog/150/head -> origin/gh/fffrog/150/head 2025-09-07T07:46:33.5812160Z * [new branch] gh/fffrog/150/orig -> origin/gh/fffrog/150/orig 2025-09-07T07:46:33.5812386Z * [new branch] gh/fffrog/151/base -> origin/gh/fffrog/151/base 2025-09-07T07:46:33.5812598Z * [new branch] gh/fffrog/151/head -> origin/gh/fffrog/151/head 2025-09-07T07:46:33.5812806Z * [new branch] gh/fffrog/151/orig -> origin/gh/fffrog/151/orig 2025-09-07T07:46:33.5813029Z * [new branch] gh/fffrog/152/base -> origin/gh/fffrog/152/base 2025-09-07T07:46:33.5813239Z * [new 
branch] gh/fffrog/152/head -> origin/gh/fffrog/152/head 2025-09-07T07:46:33.5813465Z * [new branch] gh/fffrog/153/base -> origin/gh/fffrog/153/base 2025-09-07T07:46:33.5813676Z * [new branch] gh/fffrog/153/head -> origin/gh/fffrog/153/head 2025-09-07T07:46:33.5813904Z * [new branch] gh/fffrog/153/orig -> origin/gh/fffrog/153/orig 2025-09-07T07:46:33.5814136Z * [new branch] gh/gmagogsfm/1/base -> origin/gh/gmagogsfm/1/base 2025-09-07T07:46:33.5814357Z * [new branch] gh/gmagogsfm/1/head -> origin/gh/gmagogsfm/1/head 2025-09-07T07:46:33.5814591Z * [new branch] gh/gmagogsfm/1/orig -> origin/gh/gmagogsfm/1/orig 2025-09-07T07:46:33.5814814Z * [new branch] gh/gmagogsfm/2/base -> origin/gh/gmagogsfm/2/base 2025-09-07T07:46:33.5815046Z * [new branch] gh/gmagogsfm/2/head -> origin/gh/gmagogsfm/2/head 2025-09-07T07:46:33.5815266Z * [new branch] gh/gmagogsfm/2/orig -> origin/gh/gmagogsfm/2/orig 2025-09-07T07:46:33.5815488Z * [new branch] gh/gmagogsfm/3/base -> origin/gh/gmagogsfm/3/base 2025-09-07T07:46:33.5815720Z * [new branch] gh/gmagogsfm/3/head -> origin/gh/gmagogsfm/3/head 2025-09-07T07:46:33.5815943Z * [new branch] gh/gmagogsfm/3/orig -> origin/gh/gmagogsfm/3/orig 2025-09-07T07:46:33.5816301Z * [new branch] gh/guangyey/134/base -> origin/gh/guangyey/134/base 2025-09-07T07:46:33.5816532Z * [new branch] gh/guangyey/134/head -> origin/gh/guangyey/134/head 2025-09-07T07:46:33.5816766Z * [new branch] gh/guangyey/134/orig -> origin/gh/guangyey/134/orig 2025-09-07T07:46:33.5816988Z * [new branch] gh/guangyey/135/base -> origin/gh/guangyey/135/base 2025-09-07T07:46:33.5817209Z * [new branch] gh/guangyey/135/head -> origin/gh/guangyey/135/head 2025-09-07T07:46:33.5817552Z * [new branch] gh/guangyey/135/orig -> origin/gh/guangyey/135/orig 2025-09-07T07:46:33.5817778Z * [new branch] gh/guangyey/139/base -> origin/gh/guangyey/139/base 2025-09-07T07:46:33.5818016Z * [new branch] gh/guangyey/139/head -> origin/gh/guangyey/139/head 2025-09-07T07:46:33.5818239Z * [new branch] 
gh/guangyey/139/orig -> origin/gh/guangyey/139/orig 2025-09-07T07:46:33.5818476Z * [new branch] gh/guangyey/140/base -> origin/gh/guangyey/140/base 2025-09-07T07:46:33.5818702Z * [new branch] gh/guangyey/140/head -> origin/gh/guangyey/140/head 2025-09-07T07:46:33.5818923Z * [new branch] gh/guangyey/140/orig -> origin/gh/guangyey/140/orig 2025-09-07T07:46:33.5819163Z * [new branch] gh/guangyey/142/base -> origin/gh/guangyey/142/base 2025-09-07T07:46:33.5819386Z * [new branch] gh/guangyey/142/head -> origin/gh/guangyey/142/head 2025-09-07T07:46:33.5819621Z * [new branch] gh/guangyey/142/orig -> origin/gh/guangyey/142/orig 2025-09-07T07:46:33.5819839Z * [new branch] gh/guangyey/145/base -> origin/gh/guangyey/145/base 2025-09-07T07:46:33.5820187Z * [new branch] gh/guangyey/145/head -> origin/gh/guangyey/145/head 2025-09-07T07:46:33.5820423Z * [new branch] gh/guangyey/145/orig -> origin/gh/guangyey/145/orig 2025-09-07T07:46:33.5820647Z * [new branch] gh/guangyey/153/base -> origin/gh/guangyey/153/base 2025-09-07T07:46:33.5820884Z * [new branch] gh/guangyey/153/head -> origin/gh/guangyey/153/head 2025-09-07T07:46:33.5821104Z * [new branch] gh/guangyey/153/orig -> origin/gh/guangyey/153/orig 2025-09-07T07:46:33.5821340Z * [new branch] gh/guangyey/159/base -> origin/gh/guangyey/159/base 2025-09-07T07:46:33.5821561Z * [new branch] gh/guangyey/159/head -> origin/gh/guangyey/159/head 2025-09-07T07:46:33.5821783Z * [new branch] gh/guangyey/159/orig -> origin/gh/guangyey/159/orig 2025-09-07T07:46:33.5822018Z * [new branch] gh/guangyey/163/base -> origin/gh/guangyey/163/base 2025-09-07T07:46:33.5822246Z * [new branch] gh/guangyey/163/head -> origin/gh/guangyey/163/head 2025-09-07T07:46:33.5822485Z * [new branch] gh/guangyey/163/orig -> origin/gh/guangyey/163/orig 2025-09-07T07:46:33.5822710Z * [new branch] gh/guangyey/168/base -> origin/gh/guangyey/168/base 2025-09-07T07:46:33.5822933Z * [new branch] gh/guangyey/168/head -> origin/gh/guangyey/168/head 
2025-09-07T07:46:33.5823169Z * [new branch] gh/guangyey/168/orig -> origin/gh/guangyey/168/orig 2025-09-07T07:46:33.5823392Z * [new branch] gh/guangyey/169/base -> origin/gh/guangyey/169/base 2025-09-07T07:46:33.5823625Z * [new branch] gh/guangyey/169/head -> origin/gh/guangyey/169/head 2025-09-07T07:46:33.5823848Z * [new branch] gh/guangyey/169/orig -> origin/gh/guangyey/169/orig 2025-09-07T07:46:33.5824080Z * [new branch] gh/guangyey/170/base -> origin/gh/guangyey/170/base 2025-09-07T07:46:33.5824308Z * [new branch] gh/guangyey/170/head -> origin/gh/guangyey/170/head 2025-09-07T07:46:33.5824634Z * [new branch] gh/guangyey/170/orig -> origin/gh/guangyey/170/orig 2025-09-07T07:46:33.5824870Z * [new branch] gh/guangyey/171/base -> origin/gh/guangyey/171/base 2025-09-07T07:46:33.5825093Z * [new branch] gh/guangyey/171/head -> origin/gh/guangyey/171/head 2025-09-07T07:46:33.5825329Z * [new branch] gh/guangyey/171/orig -> origin/gh/guangyey/171/orig 2025-09-07T07:46:33.5825552Z * [new branch] gh/guangyey/174/base -> origin/gh/guangyey/174/base 2025-09-07T07:46:33.5825786Z * [new branch] gh/guangyey/174/head -> origin/gh/guangyey/174/head 2025-09-07T07:46:33.5826011Z * [new branch] gh/guangyey/174/orig -> origin/gh/guangyey/174/orig 2025-09-07T07:46:33.5826239Z * [new branch] gh/guangyey/176/base -> origin/gh/guangyey/176/base 2025-09-07T07:46:33.5826472Z * [new branch] gh/guangyey/176/head -> origin/gh/guangyey/176/head 2025-09-07T07:46:33.5826700Z * [new branch] gh/guangyey/176/orig -> origin/gh/guangyey/176/orig 2025-09-07T07:46:33.5826933Z * [new branch] gh/guangyey/178/base -> origin/gh/guangyey/178/base 2025-09-07T07:46:33.5827155Z * [new branch] gh/guangyey/178/head -> origin/gh/guangyey/178/head 2025-09-07T07:46:33.5827381Z * [new branch] gh/guangyey/178/orig -> origin/gh/guangyey/178/orig 2025-09-07T07:46:33.5827616Z * [new branch] gh/guangyey/181/base -> origin/gh/guangyey/181/base 2025-09-07T07:46:33.5827849Z * [new branch] gh/guangyey/181/head -> 
origin/gh/guangyey/181/head 2025-09-07T07:46:33.5828086Z * [new branch] gh/guangyey/181/orig -> origin/gh/guangyey/181/orig 2025-09-07T07:46:33.5828405Z * [new branch] gh/guangyey/182/base -> origin/gh/guangyey/182/base 2025-09-07T07:46:33.5828644Z * [new branch] gh/guangyey/182/head -> origin/gh/guangyey/182/head 2025-09-07T07:46:33.5828872Z * [new branch] gh/guangyey/182/orig -> origin/gh/guangyey/182/orig 2025-09-07T07:46:33.5829101Z * [new branch] gh/guangyey/183/base -> origin/gh/guangyey/183/base 2025-09-07T07:46:33.5829337Z * [new branch] gh/guangyey/183/head -> origin/gh/guangyey/183/head 2025-09-07T07:46:33.5829566Z * [new branch] gh/guangyey/183/orig -> origin/gh/guangyey/183/orig 2025-09-07T07:46:33.5829802Z * [new branch] gh/guangyey/184/base -> origin/gh/guangyey/184/base 2025-09-07T07:46:33.5830027Z * [new branch] gh/guangyey/184/head -> origin/gh/guangyey/184/head 2025-09-07T07:46:33.5830252Z * [new branch] gh/guangyey/184/orig -> origin/gh/guangyey/184/orig 2025-09-07T07:46:33.5830496Z * [new branch] gh/guangyey/185/base -> origin/gh/guangyey/185/base 2025-09-07T07:46:33.5830722Z * [new branch] gh/guangyey/185/head -> origin/gh/guangyey/185/head 2025-09-07T07:46:33.5830965Z * [new branch] gh/guangyey/185/orig -> origin/gh/guangyey/185/orig 2025-09-07T07:46:33.5831190Z * [new branch] gh/guangyey/186/base -> origin/gh/guangyey/186/base 2025-09-07T07:46:33.5831430Z * [new branch] gh/guangyey/186/head -> origin/gh/guangyey/186/head 2025-09-07T07:46:33.5831652Z * [new branch] gh/guangyey/186/orig -> origin/gh/guangyey/186/orig 2025-09-07T07:46:33.5831874Z * [new branch] gh/guangyey/187/base -> origin/gh/guangyey/187/base 2025-09-07T07:46:33.5832112Z * [new branch] gh/guangyey/187/head -> origin/gh/guangyey/187/head 2025-09-07T07:46:33.5832336Z * [new branch] gh/guangyey/187/orig -> origin/gh/guangyey/187/orig 2025-09-07T07:46:33.5832574Z * [new branch] gh/guangyey/188/base -> origin/gh/guangyey/188/base 2025-09-07T07:46:33.5832884Z * [new branch] 
gh/guangyey/188/head -> origin/gh/guangyey/188/head 2025-09-07T07:46:33.5833129Z * [new branch] gh/guangyey/188/orig -> origin/gh/guangyey/188/orig 2025-09-07T07:46:33.5833358Z * [new branch] gh/guangyey/189/base -> origin/gh/guangyey/189/base 2025-09-07T07:46:33.5833579Z * [new branch] gh/guangyey/189/head -> origin/gh/guangyey/189/head 2025-09-07T07:46:33.5833816Z * [new branch] gh/guangyey/189/orig -> origin/gh/guangyey/189/orig 2025-09-07T07:46:33.5834040Z * [new branch] gh/guangyey/190/base -> origin/gh/guangyey/190/base 2025-09-07T07:46:33.5834273Z * [new branch] gh/guangyey/190/head -> origin/gh/guangyey/190/head 2025-09-07T07:46:33.5834499Z * [new branch] gh/guangyey/190/orig -> origin/gh/guangyey/190/orig 2025-09-07T07:46:33.5834718Z * [new branch] gh/guangyey/191/base -> origin/gh/guangyey/191/base 2025-09-07T07:46:33.5834959Z * [new branch] gh/guangyey/191/head -> origin/gh/guangyey/191/head 2025-09-07T07:46:33.5835185Z * [new branch] gh/guangyey/191/orig -> origin/gh/guangyey/191/orig 2025-09-07T07:46:33.5835425Z * [new branch] gh/guangyey/192/base -> origin/gh/guangyey/192/base 2025-09-07T07:46:33.5835650Z * [new branch] gh/guangyey/192/head -> origin/gh/guangyey/192/head 2025-09-07T07:46:33.5835889Z * [new branch] gh/guangyey/192/orig -> origin/gh/guangyey/192/orig 2025-09-07T07:46:33.5836109Z * [new branch] gh/guangyey/193/base -> origin/gh/guangyey/193/base 2025-09-07T07:46:33.5836332Z * [new branch] gh/guangyey/193/head -> origin/gh/guangyey/193/head 2025-09-07T07:46:33.5836671Z * [new branch] gh/guangyey/193/orig -> origin/gh/guangyey/193/orig 2025-09-07T07:46:33.5836895Z * [new branch] gh/guangyey/194/base -> origin/gh/guangyey/194/base 2025-09-07T07:46:33.5837134Z * [new branch] gh/guangyey/194/head -> origin/gh/guangyey/194/head 2025-09-07T07:46:33.5837354Z * [new branch] gh/guangyey/194/orig -> origin/gh/guangyey/194/orig 2025-09-07T07:46:33.5837578Z * [new branch] gh/guangyey/195/base -> origin/gh/guangyey/195/base 
2025-09-07T07:46:33.5837812Z * [new branch] gh/guangyey/195/head -> origin/gh/guangyey/195/head 2025-09-07T07:46:33.5838038Z * [new branch] gh/guangyey/195/orig -> origin/gh/guangyey/195/orig 2025-09-07T07:46:33.5838277Z * [new branch] gh/guangyey/196/base -> origin/gh/guangyey/196/base 2025-09-07T07:46:33.5838499Z * [new branch] gh/guangyey/196/head -> origin/gh/guangyey/196/head 2025-09-07T07:46:33.5838737Z * [new branch] gh/guangyey/196/orig -> origin/gh/guangyey/196/orig 2025-09-07T07:46:33.5838961Z * [new branch] gh/guangyey/197/base -> origin/gh/guangyey/197/base 2025-09-07T07:46:33.5839182Z * [new branch] gh/guangyey/197/head -> origin/gh/guangyey/197/head 2025-09-07T07:46:33.5839419Z * [new branch] gh/guangyey/197/orig -> origin/gh/guangyey/197/orig 2025-09-07T07:46:33.5839635Z * [new branch] gh/guangyey/198/base -> origin/gh/guangyey/198/base 2025-09-07T07:46:33.5839872Z * [new branch] gh/guangyey/198/head -> origin/gh/guangyey/198/head 2025-09-07T07:46:33.5840094Z * [new branch] gh/guangyey/198/orig -> origin/gh/guangyey/198/orig 2025-09-07T07:46:33.5840328Z * [new branch] gh/guangyey/199/base -> origin/gh/guangyey/199/base 2025-09-07T07:46:33.5840550Z * [new branch] gh/guangyey/199/head -> origin/gh/guangyey/199/head 2025-09-07T07:46:33.5840777Z * [new branch] gh/guangyey/199/orig -> origin/gh/guangyey/199/orig 2025-09-07T07:46:33.5841098Z * [new branch] gh/guangyey/200/base -> origin/gh/guangyey/200/base 2025-09-07T07:46:33.5841329Z * [new branch] gh/guangyey/200/head -> origin/gh/guangyey/200/head 2025-09-07T07:46:33.5841564Z * [new branch] gh/guangyey/200/orig -> origin/gh/guangyey/200/orig 2025-09-07T07:46:33.5841786Z * [new branch] gh/guangyey/201/base -> origin/gh/guangyey/201/base 2025-09-07T07:46:33.5842007Z * [new branch] gh/guangyey/201/head -> origin/gh/guangyey/201/head 2025-09-07T07:46:33.5842242Z * [new branch] gh/guangyey/201/orig -> origin/gh/guangyey/201/orig 2025-09-07T07:46:33.5842462Z * [new branch] gh/guangyey/202/base -> 
origin/gh/guangyey/202/base 2025-09-07T07:46:33.5842697Z * [new branch] gh/guangyey/202/head -> origin/gh/guangyey/202/head 2025-09-07T07:46:33.5843066Z * [new branch] gh/guangyey/202/orig -> origin/gh/guangyey/202/orig 2025-09-07T07:46:33.5843307Z * [new branch] gh/guangyey/203/base -> origin/gh/guangyey/203/base 2025-09-07T07:46:33.5843532Z * [new branch] gh/guangyey/203/head -> origin/gh/guangyey/203/head 2025-09-07T07:46:33.5843757Z * [new branch] gh/guangyey/203/orig -> origin/gh/guangyey/203/orig 2025-09-07T07:46:33.5843990Z * [new branch] gh/guangyey/204/base -> origin/gh/guangyey/204/base 2025-09-07T07:46:33.5844214Z * [new branch] gh/guangyey/204/head -> origin/gh/guangyey/204/head 2025-09-07T07:46:33.5844447Z * [new branch] gh/guangyey/204/orig -> origin/gh/guangyey/204/orig 2025-09-07T07:46:33.5844669Z * [new branch] gh/guangyey/205/base -> origin/gh/guangyey/205/base 2025-09-07T07:46:33.5845022Z * [new branch] gh/guangyey/205/head -> origin/gh/guangyey/205/head 2025-09-07T07:46:33.5845249Z * [new branch] gh/guangyey/205/orig -> origin/gh/guangyey/205/orig 2025-09-07T07:46:33.5845474Z * [new branch] gh/guangyey/206/base -> origin/gh/guangyey/206/base 2025-09-07T07:46:33.5845711Z * [new branch] gh/guangyey/206/head -> origin/gh/guangyey/206/head 2025-09-07T07:46:33.5845933Z * [new branch] gh/guangyey/206/orig -> origin/gh/guangyey/206/orig 2025-09-07T07:46:33.5846165Z * [new branch] gh/guangyey/207/base -> origin/gh/guangyey/207/base 2025-09-07T07:46:33.5846387Z * [new branch] gh/guangyey/207/head -> origin/gh/guangyey/207/head 2025-09-07T07:46:33.5846610Z * [new branch] gh/guangyey/207/orig -> origin/gh/guangyey/207/orig 2025-09-07T07:46:33.5846845Z * [new branch] gh/guangyey/79/base -> origin/gh/guangyey/79/base 2025-09-07T07:46:33.5847072Z * [new branch] gh/guangyey/79/head -> origin/gh/guangyey/79/head 2025-09-07T07:46:33.5847308Z * [new branch] gh/guangyey/79/orig -> origin/gh/guangyey/79/orig 2025-09-07T07:46:33.5847530Z * [new branch] 
gh/guangyey/89/base -> origin/gh/guangyey/89/base 2025-09-07T07:46:33.5847761Z * [new branch] gh/guangyey/89/head -> origin/gh/guangyey/89/head 2025-09-07T07:46:33.5847981Z * [new branch] gh/guangyey/89/orig -> origin/gh/guangyey/89/orig 2025-09-07T07:46:33.5848258Z * [new branch] gh/guilhermeleobas/107/base -> origin/gh/guilhermeleobas/107/base 2025-09-07T07:46:33.5848542Z * [new branch] gh/guilhermeleobas/107/head -> origin/gh/guilhermeleobas/107/head 2025-09-07T07:46:33.5848807Z * [new branch] gh/guilhermeleobas/107/orig -> origin/gh/guilhermeleobas/107/orig 2025-09-07T07:46:33.5849086Z * [new branch] gh/guilhermeleobas/108/base -> origin/gh/guilhermeleobas/108/base 2025-09-07T07:46:33.5849356Z * [new branch] gh/guilhermeleobas/108/head -> origin/gh/guilhermeleobas/108/head 2025-09-07T07:46:33.5849753Z * [new branch] gh/guilhermeleobas/108/orig -> origin/gh/guilhermeleobas/108/orig 2025-09-07T07:46:33.5850037Z * [new branch] gh/guilhermeleobas/124/base -> origin/gh/guilhermeleobas/124/base 2025-09-07T07:46:33.5850306Z * [new branch] gh/guilhermeleobas/124/head -> origin/gh/guilhermeleobas/124/head 2025-09-07T07:46:33.5850588Z * [new branch] gh/guilhermeleobas/124/orig -> origin/gh/guilhermeleobas/124/orig 2025-09-07T07:46:33.5850856Z * [new branch] gh/guilhermeleobas/147/base -> origin/gh/guilhermeleobas/147/base 2025-09-07T07:46:33.5851141Z * [new branch] gh/guilhermeleobas/147/head -> origin/gh/guilhermeleobas/147/head 2025-09-07T07:46:33.5851408Z * [new branch] gh/guilhermeleobas/147/orig -> origin/gh/guilhermeleobas/147/orig 2025-09-07T07:46:33.5851680Z * [new branch] gh/guilhermeleobas/150/base -> origin/gh/guilhermeleobas/150/base 2025-09-07T07:46:33.5851967Z * [new branch] gh/guilhermeleobas/150/head -> origin/gh/guilhermeleobas/150/head 2025-09-07T07:46:33.5852234Z * [new branch] gh/guilhermeleobas/150/orig -> origin/gh/guilhermeleobas/150/orig 2025-09-07T07:46:33.5852514Z * [new branch] gh/guilhermeleobas/163/base -> origin/gh/guilhermeleobas/163/base 
2025-09-07T07:46:33.5852776Z * [new branch] gh/guilhermeleobas/163/head -> origin/gh/guilhermeleobas/163/head 2025-09-07T07:46:33.5853056Z * [new branch] gh/guilhermeleobas/163/orig -> origin/gh/guilhermeleobas/163/orig 2025-09-07T07:46:33.5853320Z * [new branch] gh/guilhermeleobas/164/base -> origin/gh/guilhermeleobas/164/base 2025-09-07T07:46:33.5853585Z * [new branch] gh/guilhermeleobas/164/head -> origin/gh/guilhermeleobas/164/head 2025-09-07T07:46:33.5853972Z * [new branch] gh/guilhermeleobas/164/orig -> origin/gh/guilhermeleobas/164/orig 2025-09-07T07:46:33.5854240Z * [new branch] gh/guilhermeleobas/165/base -> origin/gh/guilhermeleobas/165/base 2025-09-07T07:46:33.5854521Z * [new branch] gh/guilhermeleobas/165/head -> origin/gh/guilhermeleobas/165/head 2025-09-07T07:46:33.5854785Z * [new branch] gh/guilhermeleobas/165/orig -> origin/gh/guilhermeleobas/165/orig 2025-09-07T07:46:33.5855065Z * [new branch] gh/guilhermeleobas/166/base -> origin/gh/guilhermeleobas/166/base 2025-09-07T07:46:33.5855332Z * [new branch] gh/guilhermeleobas/166/head -> origin/gh/guilhermeleobas/166/head 2025-09-07T07:46:33.5855599Z * [new branch] gh/guilhermeleobas/166/orig -> origin/gh/guilhermeleobas/166/orig 2025-09-07T07:46:33.5855879Z * [new branch] gh/guilhermeleobas/167/base -> origin/gh/guilhermeleobas/167/base 2025-09-07T07:46:33.5856147Z * [new branch] gh/guilhermeleobas/167/head -> origin/gh/guilhermeleobas/167/head 2025-09-07T07:46:33.5856422Z * [new branch] gh/guilhermeleobas/167/orig -> origin/gh/guilhermeleobas/167/orig 2025-09-07T07:46:33.5856689Z * [new branch] gh/guilhermeleobas/168/base -> origin/gh/guilhermeleobas/168/base 2025-09-07T07:46:33.5856967Z * [new branch] gh/guilhermeleobas/168/head -> origin/gh/guilhermeleobas/168/head 2025-09-07T07:46:33.5857233Z * [new branch] gh/guilhermeleobas/168/orig -> origin/gh/guilhermeleobas/168/orig 2025-09-07T07:46:33.5857582Z * [new branch] gh/guilhermeleobas/169/base -> origin/gh/guilhermeleobas/169/base 
2025-09-07T07:46:33.5857871Z * [new branch] gh/guilhermeleobas/169/head -> origin/gh/guilhermeleobas/169/head 2025-09-07T07:46:33.5858137Z * [new branch] gh/guilhermeleobas/169/orig -> origin/gh/guilhermeleobas/169/orig 2025-09-07T07:46:33.5858416Z * [new branch] gh/guilhermeleobas/170/base -> origin/gh/guilhermeleobas/170/base 2025-09-07T07:46:33.5858684Z * [new branch] gh/guilhermeleobas/170/head -> origin/gh/guilhermeleobas/170/head 2025-09-07T07:46:33.5859052Z * [new branch] gh/guilhermeleobas/170/orig -> origin/gh/guilhermeleobas/170/orig 2025-09-07T07:46:33.5859337Z * [new branch] gh/guilhermeleobas/171/base -> origin/gh/guilhermeleobas/171/base 2025-09-07T07:46:33.5859604Z * [new branch] gh/guilhermeleobas/171/head -> origin/gh/guilhermeleobas/171/head 2025-09-07T07:46:33.5859885Z * [new branch] gh/guilhermeleobas/171/orig -> origin/gh/guilhermeleobas/171/orig 2025-09-07T07:46:33.5860150Z * [new branch] gh/guilhermeleobas/173/base -> origin/gh/guilhermeleobas/173/base 2025-09-07T07:46:33.5860429Z * [new branch] gh/guilhermeleobas/173/head -> origin/gh/guilhermeleobas/173/head 2025-09-07T07:46:33.5860691Z * [new branch] gh/guilhermeleobas/173/orig -> origin/gh/guilhermeleobas/173/orig 2025-09-07T07:46:33.5860964Z * [new branch] gh/guilhermeleobas/192/base -> origin/gh/guilhermeleobas/192/base 2025-09-07T07:46:33.5861245Z * [new branch] gh/guilhermeleobas/192/head -> origin/gh/guilhermeleobas/192/head 2025-09-07T07:46:33.5861511Z * [new branch] gh/guilhermeleobas/192/orig -> origin/gh/guilhermeleobas/192/orig 2025-09-07T07:46:33.5861793Z * [new branch] gh/guilhermeleobas/193/base -> origin/gh/guilhermeleobas/193/base 2025-09-07T07:46:33.5862057Z * [new branch] gh/guilhermeleobas/193/head -> origin/gh/guilhermeleobas/193/head 2025-09-07T07:46:33.5862334Z * [new branch] gh/guilhermeleobas/193/orig -> origin/gh/guilhermeleobas/193/orig 2025-09-07T07:46:33.5862601Z * [new branch] gh/guilhermeleobas/194/base -> origin/gh/guilhermeleobas/194/base 
2025-09-07T07:46:33.5862868Z * [new branch] gh/guilhermeleobas/194/head -> origin/gh/guilhermeleobas/194/head 2025-09-07T07:46:33.5863339Z * [new branch] gh/guilhermeleobas/194/orig -> origin/gh/guilhermeleobas/194/orig 2025-09-07T07:46:33.5863609Z * [new branch] gh/guilhermeleobas/203/base -> origin/gh/guilhermeleobas/203/base 2025-09-07T07:46:33.5863889Z * [new branch] gh/guilhermeleobas/203/head -> origin/gh/guilhermeleobas/203/head 2025-09-07T07:46:33.5864156Z * [new branch] gh/guilhermeleobas/203/orig -> origin/gh/guilhermeleobas/203/orig 2025-09-07T07:46:33.5864437Z * [new branch] gh/guilhermeleobas/204/base -> origin/gh/guilhermeleobas/204/base 2025-09-07T07:46:33.5864706Z * [new branch] gh/guilhermeleobas/204/head -> origin/gh/guilhermeleobas/204/head 2025-09-07T07:46:33.5864972Z * [new branch] gh/guilhermeleobas/204/orig -> origin/gh/guilhermeleobas/204/orig 2025-09-07T07:46:33.5865250Z * [new branch] gh/guilhermeleobas/205/base -> origin/gh/guilhermeleobas/205/base 2025-09-07T07:46:33.5865519Z * [new branch] gh/guilhermeleobas/205/head -> origin/gh/guilhermeleobas/205/head 2025-09-07T07:46:33.5865799Z * [new branch] gh/guilhermeleobas/205/orig -> origin/gh/guilhermeleobas/205/orig 2025-09-07T07:46:33.5866067Z * [new branch] gh/guilhermeleobas/209/base -> origin/gh/guilhermeleobas/209/base 2025-09-07T07:46:33.5866334Z * [new branch] gh/guilhermeleobas/209/head -> origin/gh/guilhermeleobas/209/head 2025-09-07T07:46:33.5866614Z * [new branch] gh/guilhermeleobas/209/orig -> origin/gh/guilhermeleobas/209/orig 2025-09-07T07:46:33.5866878Z * [new branch] gh/guilhermeleobas/210/base -> origin/gh/guilhermeleobas/210/base 2025-09-07T07:46:33.5867155Z * [new branch] gh/guilhermeleobas/210/head -> origin/gh/guilhermeleobas/210/head 2025-09-07T07:46:33.5867424Z * [new branch] gh/guilhermeleobas/210/orig -> origin/gh/guilhermeleobas/210/orig 2025-09-07T07:46:33.5867702Z * [new branch] gh/guilhermeleobas/211/base -> origin/gh/guilhermeleobas/211/base 
2025-09-07T07:46:33.5867973Z * [new branch] gh/guilhermeleobas/211/head -> origin/gh/guilhermeleobas/211/head 2025-09-07T07:46:33.5868337Z * [new branch] gh/guilhermeleobas/211/orig -> origin/gh/guilhermeleobas/211/orig 2025-09-07T07:46:33.5868619Z * [new branch] gh/guilhermeleobas/214/base -> origin/gh/guilhermeleobas/214/base 2025-09-07T07:46:33.5868885Z * [new branch] gh/guilhermeleobas/214/head -> origin/gh/guilhermeleobas/214/head 2025-09-07T07:46:33.5869162Z * [new branch] gh/guilhermeleobas/214/orig -> origin/gh/guilhermeleobas/214/orig 2025-09-07T07:46:33.5869426Z * [new branch] gh/guilhermeleobas/215/base -> origin/gh/guilhermeleobas/215/base 2025-09-07T07:46:33.5869705Z * [new branch] gh/guilhermeleobas/215/head -> origin/gh/guilhermeleobas/215/head 2025-09-07T07:46:33.5869971Z * [new branch] gh/guilhermeleobas/215/orig -> origin/gh/guilhermeleobas/215/orig 2025-09-07T07:46:33.5870242Z * [new branch] gh/guilhermeleobas/216/base -> origin/gh/guilhermeleobas/216/base 2025-09-07T07:46:33.5870521Z * [new branch] gh/guilhermeleobas/216/head -> origin/gh/guilhermeleobas/216/head 2025-09-07T07:46:33.5870786Z * [new branch] gh/guilhermeleobas/216/orig -> origin/gh/guilhermeleobas/216/orig 2025-09-07T07:46:33.5871064Z * [new branch] gh/guilhermeleobas/217/base -> origin/gh/guilhermeleobas/217/base 2025-09-07T07:46:33.5871329Z * [new branch] gh/guilhermeleobas/217/head -> origin/gh/guilhermeleobas/217/head 2025-09-07T07:46:33.5885448Z * [new branch] gh/guilhermeleobas/217/orig -> origin/gh/guilhermeleobas/217/orig 2025-09-07T07:46:33.5885890Z * [new branch] gh/guilhermeleobas/219/base -> origin/gh/guilhermeleobas/219/base 2025-09-07T07:46:33.5886171Z * [new branch] gh/guilhermeleobas/219/head -> origin/gh/guilhermeleobas/219/head 2025-09-07T07:46:33.5886718Z * [new branch] gh/guilhermeleobas/219/orig -> origin/gh/guilhermeleobas/219/orig 2025-09-07T07:46:33.5886987Z * [new branch] gh/guilhermeleobas/220/base -> origin/gh/guilhermeleobas/220/base 
2025-09-07T07:46:33.5887255Z * [new branch] gh/guilhermeleobas/220/head -> origin/gh/guilhermeleobas/220/head 2025-09-07T07:46:33.5887527Z * [new branch] gh/guilhermeleobas/220/orig -> origin/gh/guilhermeleobas/220/orig 2025-09-07T07:46:33.5887788Z * [new branch] gh/guilhermeleobas/221/base -> origin/gh/guilhermeleobas/221/base 2025-09-07T07:46:33.5888062Z * [new branch] gh/guilhermeleobas/221/head -> origin/gh/guilhermeleobas/221/head 2025-09-07T07:46:33.5888325Z * [new branch] gh/guilhermeleobas/221/orig -> origin/gh/guilhermeleobas/221/orig 2025-09-07T07:46:33.5888602Z * [new branch] gh/guilhermeleobas/222/base -> origin/gh/guilhermeleobas/222/base 2025-09-07T07:46:33.5888871Z * [new branch] gh/guilhermeleobas/222/head -> origin/gh/guilhermeleobas/222/head 2025-09-07T07:46:33.5889137Z * [new branch] gh/guilhermeleobas/222/orig -> origin/gh/guilhermeleobas/222/orig 2025-09-07T07:46:33.5889419Z * [new branch] gh/guilhermeleobas/223/base -> origin/gh/guilhermeleobas/223/base 2025-09-07T07:46:33.5889686Z * [new branch] gh/guilhermeleobas/223/head -> origin/gh/guilhermeleobas/223/head 2025-09-07T07:46:33.5889965Z * [new branch] gh/guilhermeleobas/223/orig -> origin/gh/guilhermeleobas/223/orig 2025-09-07T07:46:33.5890231Z * [new branch] gh/guilhermeleobas/224/base -> origin/gh/guilhermeleobas/224/base 2025-09-07T07:46:33.5890510Z * [new branch] gh/guilhermeleobas/224/head -> origin/gh/guilhermeleobas/224/head 2025-09-07T07:46:33.5890771Z * [new branch] gh/guilhermeleobas/224/orig -> origin/gh/guilhermeleobas/224/orig 2025-09-07T07:46:33.5891034Z * [new branch] gh/guilhermeleobas/225/base -> origin/gh/guilhermeleobas/225/base 2025-09-07T07:46:33.5891319Z * [new branch] gh/guilhermeleobas/225/head -> origin/gh/guilhermeleobas/225/head 2025-09-07T07:46:33.5891722Z * [new branch] gh/guilhermeleobas/225/orig -> origin/gh/guilhermeleobas/225/orig 2025-09-07T07:46:33.5892006Z * [new branch] gh/guilhermeleobas/226/base -> origin/gh/guilhermeleobas/226/base 
2025-09-07T07:46:33.5892274Z * [new branch] gh/guilhermeleobas/226/head -> origin/gh/guilhermeleobas/226/head 2025-09-07T07:46:33.5892554Z * [new branch] gh/guilhermeleobas/226/orig -> origin/gh/guilhermeleobas/226/orig 2025-09-07T07:46:33.5892819Z * [new branch] gh/guilhermeleobas/227/base -> origin/gh/guilhermeleobas/227/base 2025-09-07T07:46:33.5893085Z * [new branch] gh/guilhermeleobas/227/head -> origin/gh/guilhermeleobas/227/head 2025-09-07T07:46:33.5893364Z * [new branch] gh/guilhermeleobas/227/orig -> origin/gh/guilhermeleobas/227/orig 2025-09-07T07:46:33.5893634Z * [new branch] gh/guilhermeleobas/228/base -> origin/gh/guilhermeleobas/228/base 2025-09-07T07:46:33.5893918Z * [new branch] gh/guilhermeleobas/228/head -> origin/gh/guilhermeleobas/228/head 2025-09-07T07:46:33.5894186Z * [new branch] gh/guilhermeleobas/228/orig -> origin/gh/guilhermeleobas/228/orig 2025-09-07T07:46:33.5894461Z * [new branch] gh/guilhermeleobas/229/base -> origin/gh/guilhermeleobas/229/base 2025-09-07T07:46:33.5894728Z * [new branch] gh/guilhermeleobas/229/head -> origin/gh/guilhermeleobas/229/head 2025-09-07T07:46:33.5894995Z * [new branch] gh/guilhermeleobas/229/orig -> origin/gh/guilhermeleobas/229/orig 2025-09-07T07:46:33.5895274Z * [new branch] gh/guilhermeleobas/230/base -> origin/gh/guilhermeleobas/230/base 2025-09-07T07:46:33.5895539Z * [new branch] gh/guilhermeleobas/230/head -> origin/gh/guilhermeleobas/230/head 2025-09-07T07:46:33.5895918Z * [new branch] gh/guilhermeleobas/230/orig -> origin/gh/guilhermeleobas/230/orig 2025-09-07T07:46:33.5896189Z * [new branch] gh/guilhermeleobas/231/base -> origin/gh/guilhermeleobas/231/base 2025-09-07T07:46:33.5896455Z * [new branch] gh/guilhermeleobas/231/head -> origin/gh/guilhermeleobas/231/head 2025-09-07T07:46:33.5896731Z * [new branch] gh/guilhermeleobas/231/orig -> origin/gh/guilhermeleobas/231/orig 2025-09-07T07:46:33.5896998Z * [new branch] gh/guilhermeleobas/232/base -> origin/gh/guilhermeleobas/232/base 
2025-09-07T07:46:33.5897278Z * [new branch] gh/guilhermeleobas/232/head -> origin/gh/guilhermeleobas/232/head 2025-09-07T07:46:33.5897648Z * [new branch] gh/guilhermeleobas/232/orig -> origin/gh/guilhermeleobas/232/orig 2025-09-07T07:46:33.5897933Z * [new branch] gh/guilhermeleobas/233/base -> origin/gh/guilhermeleobas/233/base 2025-09-07T07:46:33.5898197Z * [new branch] gh/guilhermeleobas/233/head -> origin/gh/guilhermeleobas/233/head 2025-09-07T07:46:33.5898460Z * [new branch] gh/guilhermeleobas/233/orig -> origin/gh/guilhermeleobas/233/orig 2025-09-07T07:46:33.5898743Z * [new branch] gh/guilhermeleobas/234/base -> origin/gh/guilhermeleobas/234/base 2025-09-07T07:46:33.5899006Z * [new branch] gh/guilhermeleobas/234/head -> origin/gh/guilhermeleobas/234/head 2025-09-07T07:46:33.5899284Z * [new branch] gh/guilhermeleobas/234/orig -> origin/gh/guilhermeleobas/234/orig 2025-09-07T07:46:33.5899547Z * [new branch] gh/guilhermeleobas/235/base -> origin/gh/guilhermeleobas/235/base 2025-09-07T07:46:33.5899825Z * [new branch] gh/guilhermeleobas/235/head -> origin/gh/guilhermeleobas/235/head 2025-09-07T07:46:33.5900090Z * [new branch] gh/guilhermeleobas/235/orig -> origin/gh/guilhermeleobas/235/orig 2025-09-07T07:46:33.5900354Z * [new branch] gh/guilhermeleobas/236/base -> origin/gh/guilhermeleobas/236/base 2025-09-07T07:46:33.5900642Z * [new branch] gh/guilhermeleobas/236/head -> origin/gh/guilhermeleobas/236/head 2025-09-07T07:46:33.5901033Z * [new branch] gh/guilhermeleobas/236/orig -> origin/gh/guilhermeleobas/236/orig 2025-09-07T07:46:33.5901316Z * [new branch] gh/guilhermeleobas/237/base -> origin/gh/guilhermeleobas/237/base 2025-09-07T07:46:33.5901584Z * [new branch] gh/guilhermeleobas/237/head -> origin/gh/guilhermeleobas/237/head 2025-09-07T07:46:33.5901863Z * [new branch] gh/guilhermeleobas/237/orig -> origin/gh/guilhermeleobas/237/orig 2025-09-07T07:46:33.5902132Z * [new branch] gh/guilhermeleobas/238/base -> origin/gh/guilhermeleobas/238/base 
2025-09-07T07:46:33.5902396Z * [new branch] gh/guilhermeleobas/238/head -> origin/gh/guilhermeleobas/238/head 2025-09-07T07:46:33.5902669Z * [new branch] gh/guilhermeleobas/238/orig -> origin/gh/guilhermeleobas/238/orig 2025-09-07T07:46:33.5902941Z * [new branch] gh/guilhermeleobas/239/base -> origin/gh/guilhermeleobas/239/base 2025-09-07T07:46:33.5903222Z * [new branch] gh/guilhermeleobas/239/head -> origin/gh/guilhermeleobas/239/head 2025-09-07T07:46:33.5903488Z * [new branch] gh/guilhermeleobas/239/orig -> origin/gh/guilhermeleobas/239/orig 2025-09-07T07:46:33.5903766Z * [new branch] gh/guilhermeleobas/240/base -> origin/gh/guilhermeleobas/240/base 2025-09-07T07:46:33.5904032Z * [new branch] gh/guilhermeleobas/240/head -> origin/gh/guilhermeleobas/240/head 2025-09-07T07:46:33.5904298Z * [new branch] gh/guilhermeleobas/240/orig -> origin/gh/guilhermeleobas/240/orig 2025-09-07T07:46:33.5904577Z * [new branch] gh/guilhermeleobas/241/base -> origin/gh/guilhermeleobas/241/base 2025-09-07T07:46:33.5904844Z * [new branch] gh/guilhermeleobas/241/head -> origin/gh/guilhermeleobas/241/head 2025-09-07T07:46:33.5905222Z * [new branch] gh/guilhermeleobas/241/orig -> origin/gh/guilhermeleobas/241/orig 2025-09-07T07:46:33.5905490Z * [new branch] gh/guilhermeleobas/242/base -> origin/gh/guilhermeleobas/242/base 2025-09-07T07:46:33.5905761Z * [new branch] gh/guilhermeleobas/242/head -> origin/gh/guilhermeleobas/242/head 2025-09-07T07:46:33.5906043Z * [new branch] gh/guilhermeleobas/242/orig -> origin/gh/guilhermeleobas/242/orig 2025-09-07T07:46:33.5906309Z * [new branch] gh/guilhermeleobas/243/base -> origin/gh/guilhermeleobas/243/base 2025-09-07T07:46:33.5906589Z * [new branch] gh/guilhermeleobas/243/head -> origin/gh/guilhermeleobas/243/head 2025-09-07T07:46:33.5906852Z * [new branch] gh/guilhermeleobas/243/orig -> origin/gh/guilhermeleobas/243/orig 2025-09-07T07:46:33.5907130Z * [new branch] gh/guilhermeleobas/244/base -> origin/gh/guilhermeleobas/244/base 
2025-09-07T07:46:33.5907398Z * [new branch] gh/guilhermeleobas/244/head -> origin/gh/guilhermeleobas/244/head 2025-09-07T07:46:33.5907674Z * [new branch] gh/guilhermeleobas/244/orig -> origin/gh/guilhermeleobas/244/orig 2025-09-07T07:46:33.5907948Z * [new branch] gh/guilhermeleobas/245/base -> origin/gh/guilhermeleobas/245/base 2025-09-07T07:46:33.5908212Z * [new branch] gh/guilhermeleobas/245/head -> origin/gh/guilhermeleobas/245/head 2025-09-07T07:46:33.5908488Z * [new branch] gh/guilhermeleobas/245/orig -> origin/gh/guilhermeleobas/245/orig 2025-09-07T07:46:33.5908753Z * [new branch] gh/guilhermeleobas/73/base -> origin/gh/guilhermeleobas/73/base 2025-09-07T07:46:33.5909023Z * [new branch] gh/guilhermeleobas/73/head -> origin/gh/guilhermeleobas/73/head 2025-09-07T07:46:33.5909283Z * [new branch] gh/guilhermeleobas/73/orig -> origin/gh/guilhermeleobas/73/orig 2025-09-07T07:46:33.5909546Z * [new branch] gh/henrylhtsang/140/base -> origin/gh/henrylhtsang/140/base 2025-09-07T07:46:33.5909801Z * [new branch] gh/henrylhtsang/140/head -> origin/gh/henrylhtsang/140/head 2025-09-07T07:46:33.5910178Z * [new branch] gh/henrylhtsang/140/orig -> origin/gh/henrylhtsang/140/orig 2025-09-07T07:46:33.5910443Z * [new branch] gh/henrylhtsang/141/base -> origin/gh/henrylhtsang/141/base 2025-09-07T07:46:33.5910689Z * [new branch] gh/henrylhtsang/141/head -> origin/gh/henrylhtsang/141/head 2025-09-07T07:46:33.5910940Z * [new branch] gh/henrylhtsang/141/orig -> origin/gh/henrylhtsang/141/orig 2025-09-07T07:46:33.5911185Z * [new branch] gh/henrylhtsang/142/base -> origin/gh/henrylhtsang/142/base 2025-09-07T07:46:33.5911431Z * [new branch] gh/henrylhtsang/142/head -> origin/gh/henrylhtsang/142/head 2025-09-07T07:46:33.5911684Z * [new branch] gh/henrylhtsang/142/orig -> origin/gh/henrylhtsang/142/orig 2025-09-07T07:46:33.5911935Z * [new branch] gh/henrylhtsang/143/base -> origin/gh/henrylhtsang/143/base 2025-09-07T07:46:33.5912192Z * [new branch] gh/henrylhtsang/143/head -> 
origin/gh/henrylhtsang/143/head 2025-09-07T07:46:33.5912438Z * [new branch] gh/henrylhtsang/143/orig -> origin/gh/henrylhtsang/143/orig 2025-09-07T07:46:33.5912696Z * [new branch] gh/henrylhtsang/144/base -> origin/gh/henrylhtsang/144/base 2025-09-07T07:46:33.5912942Z * [new branch] gh/henrylhtsang/144/head -> origin/gh/henrylhtsang/144/head 2025-09-07T07:46:33.5913187Z * [new branch] gh/henrylhtsang/144/orig -> origin/gh/henrylhtsang/144/orig 2025-09-07T07:46:33.5913440Z * [new branch] gh/henrylhtsang/145/base -> origin/gh/henrylhtsang/145/base 2025-09-07T07:46:33.5913681Z * [new branch] gh/henrylhtsang/145/head -> origin/gh/henrylhtsang/145/head 2025-09-07T07:46:33.5913940Z * [new branch] gh/henrylhtsang/145/orig -> origin/gh/henrylhtsang/145/orig 2025-09-07T07:46:33.5914281Z * [new branch] gh/henrylhtsang/146/base -> origin/gh/henrylhtsang/146/base 2025-09-07T07:46:33.5914550Z * [new branch] gh/henrylhtsang/146/head -> origin/gh/henrylhtsang/146/head 2025-09-07T07:46:33.5914798Z * [new branch] gh/henrylhtsang/146/orig -> origin/gh/henrylhtsang/146/orig 2025-09-07T07:46:33.5915043Z * [new branch] gh/henrylhtsang/147/base -> origin/gh/henrylhtsang/147/base 2025-09-07T07:46:33.5915302Z * [new branch] gh/henrylhtsang/147/head -> origin/gh/henrylhtsang/147/head 2025-09-07T07:46:33.5915550Z * [new branch] gh/henrylhtsang/147/orig -> origin/gh/henrylhtsang/147/orig 2025-09-07T07:46:33.5915808Z * [new branch] gh/henrylhtsang/148/base -> origin/gh/henrylhtsang/148/base 2025-09-07T07:46:33.5916058Z * [new branch] gh/henrylhtsang/148/head -> origin/gh/henrylhtsang/148/head 2025-09-07T07:46:33.5916306Z * [new branch] gh/henrylhtsang/148/orig -> origin/gh/henrylhtsang/148/orig 2025-09-07T07:46:33.5916566Z * [new branch] gh/henrylhtsang/149/base -> origin/gh/henrylhtsang/149/base 2025-09-07T07:46:33.5916814Z * [new branch] gh/henrylhtsang/149/head -> origin/gh/henrylhtsang/149/head 2025-09-07T07:46:33.5917074Z * [new branch] gh/henrylhtsang/149/orig -> 
origin/gh/henrylhtsang/149/orig 2025-09-07T07:46:33.5917292Z * [new branch] gh/huydhn/1/next -> origin/gh/huydhn/1/next 2025-09-07T07:46:33.5917516Z * [new branch] gh/huydhn/2/next -> origin/gh/huydhn/2/next 2025-09-07T07:46:33.5917722Z * [new branch] gh/huydhn/3/next -> origin/gh/huydhn/3/next 2025-09-07T07:46:33.5917929Z * [new branch] gh/huydhn/4/next -> origin/gh/huydhn/4/next 2025-09-07T07:46:33.5918139Z * [new branch] gh/huydhn/5/next -> origin/gh/huydhn/5/next 2025-09-07T07:46:33.5918350Z * [new branch] gh/huydhn/6/next -> origin/gh/huydhn/6/next 2025-09-07T07:46:33.5918657Z * [new branch] gh/int3/97/base -> origin/gh/int3/97/base 2025-09-07T07:46:33.5918864Z * [new branch] gh/int3/97/head -> origin/gh/int3/97/head 2025-09-07T07:46:33.5919094Z * [new branch] gh/isuruf/101/base -> origin/gh/isuruf/101/base 2025-09-07T07:46:33.5919308Z * [new branch] gh/isuruf/101/head -> origin/gh/isuruf/101/head 2025-09-07T07:46:33.5919521Z * [new branch] gh/isuruf/141/base -> origin/gh/isuruf/141/base 2025-09-07T07:46:33.5919746Z * [new branch] gh/isuruf/141/head -> origin/gh/isuruf/141/head 2025-09-07T07:46:33.5919959Z * [new branch] gh/isuruf/141/orig -> origin/gh/isuruf/141/orig 2025-09-07T07:46:33.5920182Z * [new branch] gh/isuruf/142/base -> origin/gh/isuruf/142/base 2025-09-07T07:46:33.5920398Z * [new branch] gh/isuruf/142/head -> origin/gh/isuruf/142/head 2025-09-07T07:46:33.5920610Z * [new branch] gh/isuruf/142/orig -> origin/gh/isuruf/142/orig 2025-09-07T07:46:33.5920840Z * [new branch] gh/isuruf/143/base -> origin/gh/isuruf/143/base 2025-09-07T07:46:33.5921053Z * [new branch] gh/isuruf/143/head -> origin/gh/isuruf/143/head 2025-09-07T07:46:33.5921280Z * [new branch] gh/isuruf/143/orig -> origin/gh/isuruf/143/orig 2025-09-07T07:46:33.5921489Z * [new branch] gh/isuruf/144/base -> origin/gh/isuruf/144/base 2025-09-07T07:46:33.5921708Z * [new branch] gh/isuruf/144/head -> origin/gh/isuruf/144/head 2025-09-07T07:46:33.5921920Z * [new branch] gh/isuruf/144/orig -> 
origin/gh/isuruf/144/orig 2025-09-07T07:46:33.5922131Z * [new branch] gh/isuruf/145/base -> origin/gh/isuruf/145/base 2025-09-07T07:46:33.5922464Z * [new branch] gh/isuruf/145/head -> origin/gh/isuruf/145/head 2025-09-07T07:46:33.5922680Z * [new branch] gh/isuruf/145/orig -> origin/gh/isuruf/145/orig 2025-09-07T07:46:33.5923146Z * [new branch] gh/isuruf/146/base -> origin/gh/isuruf/146/base 2025-09-07T07:46:33.5923361Z * [new branch] gh/isuruf/146/head -> origin/gh/isuruf/146/head 2025-09-07T07:46:33.5923575Z * [new branch] gh/isuruf/146/orig -> origin/gh/isuruf/146/orig 2025-09-07T07:46:33.5923802Z * [new branch] gh/isuruf/81/base -> origin/gh/isuruf/81/base 2025-09-07T07:46:33.5924014Z * [new branch] gh/isuruf/81/head -> origin/gh/isuruf/81/head 2025-09-07T07:46:33.5924239Z * [new branch] gh/isuruf/81/orig -> origin/gh/isuruf/81/orig 2025-09-07T07:46:33.5924468Z * [new branch] gh/jamesjwu/150/base -> origin/gh/jamesjwu/150/base 2025-09-07T07:46:33.5924707Z * [new branch] gh/jamesjwu/150/head -> origin/gh/jamesjwu/150/head 2025-09-07T07:46:33.5924929Z * [new branch] gh/jamesjwu/150/orig -> origin/gh/jamesjwu/150/orig 2025-09-07T07:46:33.5925152Z * [new branch] gh/jamesjwu/154/base -> origin/gh/jamesjwu/154/base 2025-09-07T07:46:33.5925381Z * [new branch] gh/jamesjwu/154/head -> origin/gh/jamesjwu/154/head 2025-09-07T07:46:33.5925601Z * [new branch] gh/jamesjwu/154/orig -> origin/gh/jamesjwu/154/orig 2025-09-07T07:46:33.5925832Z * [new branch] gh/jamesjwu/155/base -> origin/gh/jamesjwu/155/base 2025-09-07T07:46:33.5926050Z * [new branch] gh/jamesjwu/155/head -> origin/gh/jamesjwu/155/head 2025-09-07T07:46:33.5926288Z * [new branch] gh/jamesjwu/155/orig -> origin/gh/jamesjwu/155/orig 2025-09-07T07:46:33.5926509Z * [new branch] gh/jamesjwu/159/base -> origin/gh/jamesjwu/159/base 2025-09-07T07:46:33.5926734Z * [new branch] gh/jamesjwu/159/head -> origin/gh/jamesjwu/159/head 2025-09-07T07:46:33.5927090Z * [new branch] gh/jamesjwu/159/orig -> origin/gh/jamesjwu/159/orig 
2025-09-07T07:46:33.5927317Z * [new branch] gh/jamesjwu/163/base -> origin/gh/jamesjwu/163/base 2025-09-07T07:46:33.5927552Z * [new branch] gh/jamesjwu/163/head -> origin/gh/jamesjwu/163/head 2025-09-07T07:46:33.5927776Z * [new branch] gh/jamesjwu/163/orig -> origin/gh/jamesjwu/163/orig 2025-09-07T07:46:33.5927999Z * [new branch] gh/jamesjwu/171/base -> origin/gh/jamesjwu/171/base 2025-09-07T07:46:33.5928233Z * [new branch] gh/jamesjwu/171/head -> origin/gh/jamesjwu/171/head 2025-09-07T07:46:33.5928453Z * [new branch] gh/jamesjwu/171/orig -> origin/gh/jamesjwu/171/orig 2025-09-07T07:46:33.5928686Z * [new branch] gh/jamesjwu/176/base -> origin/gh/jamesjwu/176/base 2025-09-07T07:46:33.5928907Z * [new branch] gh/jamesjwu/176/head -> origin/gh/jamesjwu/176/head 2025-09-07T07:46:33.5929144Z * [new branch] gh/jamesjwu/176/orig -> origin/gh/jamesjwu/176/orig 2025-09-07T07:46:33.5929366Z * [new branch] gh/jamesjwu/181/base -> origin/gh/jamesjwu/181/base 2025-09-07T07:46:33.5929589Z * [new branch] gh/jamesjwu/181/head -> origin/gh/jamesjwu/181/head 2025-09-07T07:46:33.5929822Z * [new branch] gh/jamesjwu/181/orig -> origin/gh/jamesjwu/181/orig 2025-09-07T07:46:33.5930044Z * [new branch] gh/jamesjwu/182/base -> origin/gh/jamesjwu/182/base 2025-09-07T07:46:33.5930276Z * [new branch] gh/jamesjwu/182/head -> origin/gh/jamesjwu/182/head 2025-09-07T07:46:33.5930498Z * [new branch] gh/jamesjwu/182/orig -> origin/gh/jamesjwu/182/orig 2025-09-07T07:46:33.5930852Z * [new branch] gh/jamesjwu/183/base -> origin/gh/jamesjwu/183/base 2025-09-07T07:46:33.5931089Z * [new branch] gh/jamesjwu/183/head -> origin/gh/jamesjwu/183/head 2025-09-07T07:46:33.5931312Z * [new branch] gh/jamesjwu/183/orig -> origin/gh/jamesjwu/183/orig 2025-09-07T07:46:33.5931544Z * [new branch] gh/jamesjwu/184/base -> origin/gh/jamesjwu/184/base 2025-09-07T07:46:33.5931766Z * [new branch] gh/jamesjwu/184/head -> origin/gh/jamesjwu/184/head 2025-09-07T07:46:33.5931992Z * [new branch] gh/jamesjwu/184/orig -> 
origin/gh/jamesjwu/184/orig 2025-09-07T07:46:33.5932214Z * [new branch] gh/jamesjwu/185/base -> origin/gh/jamesjwu/185/base 2025-09-07T07:46:33.5932437Z * [new branch] gh/jamesjwu/185/head -> origin/gh/jamesjwu/185/head 2025-09-07T07:46:33.5932671Z * [new branch] gh/jamesjwu/185/orig -> origin/gh/jamesjwu/185/orig 2025-09-07T07:46:33.5932895Z * [new branch] gh/jamesjwu/186/base -> origin/gh/jamesjwu/186/base 2025-09-07T07:46:33.5933134Z * [new branch] gh/jamesjwu/186/head -> origin/gh/jamesjwu/186/head 2025-09-07T07:46:33.5933355Z * [new branch] gh/jamesjwu/186/orig -> origin/gh/jamesjwu/186/orig 2025-09-07T07:46:33.5933590Z * [new branch] gh/jamesjwu/187/base -> origin/gh/jamesjwu/187/base 2025-09-07T07:46:33.5933810Z * [new branch] gh/jamesjwu/187/head -> origin/gh/jamesjwu/187/head 2025-09-07T07:46:33.5934029Z * [new branch] gh/jamesjwu/187/orig -> origin/gh/jamesjwu/187/orig 2025-09-07T07:46:33.5934263Z * [new branch] gh/jamesjwu/188/base -> origin/gh/jamesjwu/188/base 2025-09-07T07:46:33.5934483Z * [new branch] gh/jamesjwu/188/head -> origin/gh/jamesjwu/188/head 2025-09-07T07:46:33.5934718Z * [new branch] gh/jamesjwu/188/orig -> origin/gh/jamesjwu/188/orig 2025-09-07T07:46:33.5934940Z * [new branch] gh/jamesjwu/189/base -> origin/gh/jamesjwu/189/base 2025-09-07T07:46:33.5935247Z * [new branch] gh/jamesjwu/189/head -> origin/gh/jamesjwu/189/head 2025-09-07T07:46:33.5935485Z * [new branch] gh/jamesjwu/189/orig -> origin/gh/jamesjwu/189/orig 2025-09-07T07:46:33.5935708Z * [new branch] gh/jamesjwu/190/base -> origin/gh/jamesjwu/190/base 2025-09-07T07:46:33.5935941Z * [new branch] gh/jamesjwu/190/head -> origin/gh/jamesjwu/190/head 2025-09-07T07:46:33.5936165Z * [new branch] gh/jamesjwu/190/orig -> origin/gh/jamesjwu/190/orig 2025-09-07T07:46:33.5936394Z * [new branch] gh/jamesjwu/52/base -> origin/gh/jamesjwu/52/base 2025-09-07T07:46:33.5936617Z * [new branch] gh/jamesjwu/52/head -> origin/gh/jamesjwu/52/head 2025-09-07T07:46:33.5936840Z * [new branch] 
gh/jamesjwu/53/base -> origin/gh/jamesjwu/53/base 2025-09-07T07:46:33.5937071Z * [new branch] gh/jamesjwu/53/head -> origin/gh/jamesjwu/53/head 2025-09-07T07:46:33.5937370Z * [new branch] gh/jamesjwu/54/base -> origin/gh/jamesjwu/54/base 2025-09-07T07:46:33.5937613Z * [new branch] gh/jamesjwu/54/head -> origin/gh/jamesjwu/54/head 2025-09-07T07:46:33.5937834Z * [new branch] gh/jamesjwu/55/base -> origin/gh/jamesjwu/55/base 2025-09-07T07:46:33.5938050Z * [new branch] gh/jamesjwu/55/head -> origin/gh/jamesjwu/55/head 2025-09-07T07:46:33.5938278Z * [new branch] gh/jamesjwu/56/base -> origin/gh/jamesjwu/56/base 2025-09-07T07:46:33.5938496Z * [new branch] gh/jamesjwu/56/head -> origin/gh/jamesjwu/56/head 2025-09-07T07:46:33.5938723Z * [new branch] gh/jamesjwu/57/base -> origin/gh/jamesjwu/57/base 2025-09-07T07:46:33.5939057Z * [new branch] gh/jamesjwu/57/head -> origin/gh/jamesjwu/57/head 2025-09-07T07:46:33.5939289Z * [new branch] gh/jamesjwu/58/base -> origin/gh/jamesjwu/58/base 2025-09-07T07:46:33.5939511Z * [new branch] gh/jamesjwu/58/head -> origin/gh/jamesjwu/58/head 2025-09-07T07:46:33.5939730Z * [new branch] gh/jamesjwu/59/base -> origin/gh/jamesjwu/59/base 2025-09-07T07:46:33.5939962Z * [new branch] gh/jamesjwu/59/head -> origin/gh/jamesjwu/59/head 2025-09-07T07:46:33.5940182Z * [new branch] gh/jamesjwu/60/base -> origin/gh/jamesjwu/60/base 2025-09-07T07:46:33.5940414Z * [new branch] gh/jamesjwu/60/head -> origin/gh/jamesjwu/60/head 2025-09-07T07:46:33.5940634Z * [new branch] gh/jamesjwu/61/base -> origin/gh/jamesjwu/61/base 2025-09-07T07:46:33.5940867Z * [new branch] gh/jamesjwu/61/head -> origin/gh/jamesjwu/61/head 2025-09-07T07:46:33.5941089Z * [new branch] gh/jamesjwu/62/base -> origin/gh/jamesjwu/62/base 2025-09-07T07:46:33.5941309Z * [new branch] gh/jamesjwu/62/head -> origin/gh/jamesjwu/62/head 2025-09-07T07:46:33.5941543Z * [new branch] gh/jamesjwu/63/base -> origin/gh/jamesjwu/63/base 2025-09-07T07:46:33.5941761Z * [new branch] gh/jamesjwu/63/head -> 
origin/gh/jamesjwu/63/head 2025-09-07T07:46:33.5941987Z * [new branch] gh/jamesjwu/64/base -> origin/gh/jamesjwu/64/base 2025-09-07T07:46:33.5942204Z * [new branch] gh/jamesjwu/64/head -> origin/gh/jamesjwu/64/head 2025-09-07T07:46:33.5942420Z * [new branch] gh/jamesjwu/65/base -> origin/gh/jamesjwu/65/base 2025-09-07T07:46:33.5942651Z * [new branch] gh/jamesjwu/65/head -> origin/gh/jamesjwu/65/head 2025-09-07T07:46:33.5942873Z * [new branch] gh/janeyx99/165/base -> origin/gh/janeyx99/165/base 2025-09-07T07:46:33.5943110Z * [new branch] gh/janeyx99/165/head -> origin/gh/janeyx99/165/head 2025-09-07T07:46:33.5943440Z * [new branch] gh/janeyx99/165/orig -> origin/gh/janeyx99/165/orig 2025-09-07T07:46:33.5943676Z * [new branch] gh/janeyx99/201/base -> origin/gh/janeyx99/201/base 2025-09-07T07:46:33.5943897Z * [new branch] gh/janeyx99/201/head -> origin/gh/janeyx99/201/head 2025-09-07T07:46:33.5944114Z * [new branch] gh/janeyx99/201/orig -> origin/gh/janeyx99/201/orig 2025-09-07T07:46:33.5944344Z * [new branch] gh/janeyx99/225/base -> origin/gh/janeyx99/225/base 2025-09-07T07:46:33.5944563Z * [new branch] gh/janeyx99/225/head -> origin/gh/janeyx99/225/head 2025-09-07T07:46:33.5944797Z * [new branch] gh/janeyx99/225/orig -> origin/gh/janeyx99/225/orig 2025-09-07T07:46:33.5945016Z * [new branch] gh/janeyx99/296/base -> origin/gh/janeyx99/296/base 2025-09-07T07:46:33.5945241Z * [new branch] gh/janeyx99/296/head -> origin/gh/janeyx99/296/head 2025-09-07T07:46:33.5945469Z * [new branch] gh/janeyx99/296/orig -> origin/gh/janeyx99/296/orig 2025-09-07T07:46:33.5945692Z * [new branch] gh/janeyx99/297/base -> origin/gh/janeyx99/297/base 2025-09-07T07:46:33.5945922Z * [new branch] gh/janeyx99/297/head -> origin/gh/janeyx99/297/head 2025-09-07T07:46:33.5946141Z * [new branch] gh/janeyx99/297/orig -> origin/gh/janeyx99/297/orig 2025-09-07T07:46:33.5946374Z * [new branch] gh/janeyx99/298/base -> origin/gh/janeyx99/298/base 2025-09-07T07:46:33.5946592Z * [new branch] 
gh/janeyx99/298/head -> origin/gh/janeyx99/298/head 2025-09-07T07:46:33.5946810Z * [new branch] gh/janeyx99/298/orig -> origin/gh/janeyx99/298/orig 2025-09-07T07:46:33.5947131Z * [new branch] gh/janeyx99/299/base -> origin/gh/janeyx99/299/base 2025-09-07T07:46:33.5947349Z * [new branch] gh/janeyx99/299/head -> origin/gh/janeyx99/299/head 2025-09-07T07:46:33.5947579Z * [new branch] gh/janeyx99/299/orig -> origin/gh/janeyx99/299/orig 2025-09-07T07:46:33.5947799Z * [new branch] gh/janeyx99/300/base -> origin/gh/janeyx99/300/base 2025-09-07T07:46:33.5948032Z * [new branch] gh/janeyx99/300/head -> origin/gh/janeyx99/300/head 2025-09-07T07:46:33.5948253Z * [new branch] gh/janeyx99/300/orig -> origin/gh/janeyx99/300/orig 2025-09-07T07:46:33.5948477Z * [new branch] gh/janeyx99/301/base -> origin/gh/janeyx99/301/base 2025-09-07T07:46:33.5948707Z * [new branch] gh/janeyx99/301/head -> origin/gh/janeyx99/301/head 2025-09-07T07:46:33.5948924Z * [new branch] gh/janeyx99/301/orig -> origin/gh/janeyx99/301/orig 2025-09-07T07:46:33.5949161Z * [new branch] gh/janeyx99/302/base -> origin/gh/janeyx99/302/base 2025-09-07T07:46:33.5949384Z * [new branch] gh/janeyx99/302/head -> origin/gh/janeyx99/302/head 2025-09-07T07:46:33.5949601Z * [new branch] gh/janeyx99/303/base -> origin/gh/janeyx99/303/base 2025-09-07T07:46:33.5949833Z * [new branch] gh/janeyx99/303/head -> origin/gh/janeyx99/303/head 2025-09-07T07:46:33.5950047Z * [new branch] gh/janeyx99/88/base -> origin/gh/janeyx99/88/base 2025-09-07T07:46:33.5950276Z * [new branch] gh/janeyx99/88/head -> origin/gh/janeyx99/88/head 2025-09-07T07:46:33.5950490Z * [new branch] gh/janeyx99/88/orig -> origin/gh/janeyx99/88/orig 2025-09-07T07:46:33.5950716Z * [new branch] gh/jansel/360/base -> origin/gh/jansel/360/base 2025-09-07T07:46:33.5950927Z * [new branch] gh/jansel/360/head -> origin/gh/jansel/360/head 2025-09-07T07:46:33.5951143Z * [new branch] gh/jansel/451/base -> origin/gh/jansel/451/base 2025-09-07T07:46:33.5951466Z * [new branch] 
gh/jansel/451/head -> origin/gh/jansel/451/head 2025-09-07T07:46:33.5951679Z * [new branch] gh/jansel/451/orig -> origin/gh/jansel/451/orig 2025-09-07T07:46:33.5951904Z * [new branch] gh/jansel/462/base -> origin/gh/jansel/462/base 2025-09-07T07:46:33.5952109Z * [new branch] gh/jansel/462/head -> origin/gh/jansel/462/head 2025-09-07T07:46:33.5952320Z * [new branch] gh/jansel/462/orig -> origin/gh/jansel/462/orig 2025-09-07T07:46:33.5952545Z * [new branch] gh/jansel/531/base -> origin/gh/jansel/531/base 2025-09-07T07:46:33.5952758Z * [new branch] gh/jansel/531/head -> origin/gh/jansel/531/head 2025-09-07T07:46:33.5952979Z * [new branch] gh/jansel/531/orig -> origin/gh/jansel/531/orig 2025-09-07T07:46:33.5953230Z * [new branch] gh/jbschlosser/208/head -> origin/gh/jbschlosser/208/head 2025-09-07T07:46:33.5953486Z * [new branch] gh/jbschlosser/247/base -> origin/gh/jbschlosser/247/base 2025-09-07T07:46:33.5953727Z * [new branch] gh/jbschlosser/247/head -> origin/gh/jbschlosser/247/head 2025-09-07T07:46:33.5953967Z * [new branch] gh/jbschlosser/247/orig -> origin/gh/jbschlosser/247/orig 2025-09-07T07:46:33.5954215Z * [new branch] gh/jbschlosser/248/base -> origin/gh/jbschlosser/248/base 2025-09-07T07:46:33.5954455Z * [new branch] gh/jbschlosser/248/head -> origin/gh/jbschlosser/248/head 2025-09-07T07:46:33.5954703Z * [new branch] gh/jbschlosser/248/orig -> origin/gh/jbschlosser/248/orig 2025-09-07T07:46:33.5954944Z * [new branch] gh/jbschlosser/250/base -> origin/gh/jbschlosser/250/base 2025-09-07T07:46:33.5955297Z * [new branch] gh/jbschlosser/250/head -> origin/gh/jbschlosser/250/head 2025-09-07T07:46:33.5955538Z * [new branch] gh/jbschlosser/250/orig -> origin/gh/jbschlosser/250/orig 2025-09-07T07:46:33.5955771Z * [new branch] gh/jiayisunx/59/base -> origin/gh/jiayisunx/59/base 2025-09-07T07:46:33.5956008Z * [new branch] gh/jiayisunx/59/head -> origin/gh/jiayisunx/59/head 2025-09-07T07:46:33.5956232Z * [new branch] gh/jiayisunx/59/orig -> origin/gh/jiayisunx/59/orig 
2025-09-07T07:46:33.5956468Z * [new branch] gh/jiayisunx/61/base -> origin/gh/jiayisunx/61/base 2025-09-07T07:46:33.5956688Z * [new branch] gh/jiayisunx/61/head -> origin/gh/jiayisunx/61/head 2025-09-07T07:46:33.5956912Z * [new branch] gh/jiayisunx/61/orig -> origin/gh/jiayisunx/61/orig 2025-09-07T07:46:33.5957148Z * [new branch] gh/jiayisunx/64/base -> origin/gh/jiayisunx/64/base 2025-09-07T07:46:33.5957376Z * [new branch] gh/jiayisunx/64/head -> origin/gh/jiayisunx/64/head 2025-09-07T07:46:33.5957617Z * [new branch] gh/jiayisunx/64/orig -> origin/gh/jiayisunx/64/orig 2025-09-07T07:46:33.5957839Z * [new branch] gh/jiayisunx/65/base -> origin/gh/jiayisunx/65/base 2025-09-07T07:46:33.5958072Z * [new branch] gh/jiayisunx/65/head -> origin/gh/jiayisunx/65/head 2025-09-07T07:46:33.5958298Z * [new branch] gh/jiayisunx/65/orig -> origin/gh/jiayisunx/65/orig 2025-09-07T07:46:33.5958519Z * [new branch] gh/jiayisunx/66/base -> origin/gh/jiayisunx/66/base 2025-09-07T07:46:33.5958757Z * [new branch] gh/jiayisunx/66/head -> origin/gh/jiayisunx/66/head 2025-09-07T07:46:33.5958980Z * [new branch] gh/jiayisunx/66/orig -> origin/gh/jiayisunx/66/orig 2025-09-07T07:46:33.5959219Z * [new branch] gh/jiayisunx/67/base -> origin/gh/jiayisunx/67/base 2025-09-07T07:46:33.5959447Z * [new branch] gh/jiayisunx/67/head -> origin/gh/jiayisunx/67/head 2025-09-07T07:46:33.5959782Z * [new branch] gh/jiayisunx/67/orig -> origin/gh/jiayisunx/67/orig 2025-09-07T07:46:33.5960012Z * [new branch] gh/jiayisunx/68/base -> origin/gh/jiayisunx/68/base 2025-09-07T07:46:33.5960239Z * [new branch] gh/jiayisunx/68/head -> origin/gh/jiayisunx/68/head 2025-09-07T07:46:33.5960475Z * [new branch] gh/jiayisunx/68/orig -> origin/gh/jiayisunx/68/orig 2025-09-07T07:46:33.5960699Z * [new branch] gh/jiayisunx/69/base -> origin/gh/jiayisunx/69/base 2025-09-07T07:46:33.5960936Z * [new branch] gh/jiayisunx/69/head -> origin/gh/jiayisunx/69/head 2025-09-07T07:46:33.5961161Z * [new branch] gh/jiayisunx/69/orig -> 
origin/gh/jiayisunx/69/orig 2025-09-07T07:46:33.5961388Z * [new branch] gh/jiayisunx/70/base -> origin/gh/jiayisunx/70/base 2025-09-07T07:46:33.5961627Z * [new branch] gh/jiayisunx/70/head -> origin/gh/jiayisunx/70/head 2025-09-07T07:46:33.5961855Z * [new branch] gh/jiayisunx/70/orig -> origin/gh/jiayisunx/70/orig 2025-09-07T07:46:33.5962094Z * [new branch] gh/jiayisunx/71/base -> origin/gh/jiayisunx/71/base 2025-09-07T07:46:33.5962319Z * [new branch] gh/jiayisunx/71/head -> origin/gh/jiayisunx/71/head 2025-09-07T07:46:33.5962557Z * [new branch] gh/jiayisunx/71/orig -> origin/gh/jiayisunx/71/orig 2025-09-07T07:46:33.5962782Z * [new branch] gh/jiayisunx/72/base -> origin/gh/jiayisunx/72/base 2025-09-07T07:46:33.5963138Z * [new branch] gh/jiayisunx/72/head -> origin/gh/jiayisunx/72/head 2025-09-07T07:46:33.5963381Z * [new branch] gh/jiayisunx/72/orig -> origin/gh/jiayisunx/72/orig 2025-09-07T07:46:33.5963747Z * [new branch] gh/jiayisunx/73/base -> origin/gh/jiayisunx/73/base 2025-09-07T07:46:33.5963982Z * [new branch] gh/jiayisunx/73/head -> origin/gh/jiayisunx/73/head 2025-09-07T07:46:33.5964210Z * [new branch] gh/jiayisunx/73/orig -> origin/gh/jiayisunx/73/orig 2025-09-07T07:46:33.5964436Z * [new branch] gh/jiayisunx/74/base -> origin/gh/jiayisunx/74/base 2025-09-07T07:46:33.5964671Z * [new branch] gh/jiayisunx/74/head -> origin/gh/jiayisunx/74/head 2025-09-07T07:46:33.5964894Z * [new branch] gh/jiayisunx/74/orig -> origin/gh/jiayisunx/74/orig 2025-09-07T07:46:33.5965131Z * [new branch] gh/jiayisunx/75/base -> origin/gh/jiayisunx/75/base 2025-09-07T07:46:33.5965352Z * [new branch] gh/jiayisunx/75/head -> origin/gh/jiayisunx/75/head 2025-09-07T07:46:33.5965587Z * [new branch] gh/jiayisunx/75/orig -> origin/gh/jiayisunx/75/orig 2025-09-07T07:46:33.5965814Z * [new branch] gh/jiayisunx/76/base -> origin/gh/jiayisunx/76/base 2025-09-07T07:46:33.5966041Z * [new branch] gh/jiayisunx/76/head -> origin/gh/jiayisunx/76/head 2025-09-07T07:46:33.5966279Z * [new branch] 
gh/jiayisunx/76/orig -> origin/gh/jiayisunx/76/orig 2025-09-07T07:46:33.5966515Z * [new branch] gh/jjwu@meta.com/1/base -> origin/gh/jjwu@meta.com/1/base 2025-09-07T07:46:33.5966756Z * [new branch] gh/jjwu@meta.com/1/head -> origin/gh/jjwu@meta.com/1/head 2025-09-07T07:46:33.5967002Z * [new branch] gh/justinchuby/111/base -> origin/gh/justinchuby/111/base 2025-09-07T07:46:33.5967255Z * [new branch] gh/justinchuby/111/head -> origin/gh/justinchuby/111/head 2025-09-07T07:46:33.5967499Z * [new branch] gh/justinchuby/111/orig -> origin/gh/justinchuby/111/orig 2025-09-07T07:46:33.5967739Z * [new branch] gh/justinchuby/112/base -> origin/gh/justinchuby/112/base 2025-09-07T07:46:33.5967995Z * [new branch] gh/justinchuby/112/head -> origin/gh/justinchuby/112/head 2025-09-07T07:46:33.5968365Z * [new branch] gh/justinchuby/112/orig -> origin/gh/justinchuby/112/orig 2025-09-07T07:46:33.5968615Z * [new branch] gh/justinchuby/113/base -> origin/gh/justinchuby/113/base 2025-09-07T07:46:33.5968848Z * [new branch] gh/justinchuby/113/head -> origin/gh/justinchuby/113/head 2025-09-07T07:46:33.5969082Z * [new branch] gh/justinchuby/113/orig -> origin/gh/justinchuby/113/orig 2025-09-07T07:46:33.5969321Z * [new branch] gh/justinchuby/114/base -> origin/gh/justinchuby/114/base 2025-09-07T07:46:33.5969557Z * [new branch] gh/justinchuby/114/head -> origin/gh/justinchuby/114/head 2025-09-07T07:46:33.5969800Z * [new branch] gh/justinchuby/114/orig -> origin/gh/justinchuby/114/orig 2025-09-07T07:46:33.5970042Z * [new branch] gh/justinchuby/115/base -> origin/gh/justinchuby/115/base 2025-09-07T07:46:33.5970298Z * [new branch] gh/justinchuby/115/head -> origin/gh/justinchuby/115/head 2025-09-07T07:46:33.5970531Z * [new branch] gh/justinchuby/115/orig -> origin/gh/justinchuby/115/orig 2025-09-07T07:46:33.5970761Z * [new branch] gh/karthickai/1/base -> origin/gh/karthickai/1/base 2025-09-07T07:46:33.5971002Z * [new branch] gh/karthickai/1/head -> origin/gh/karthickai/1/head 
2025-09-07T07:46:33.5971229Z * [new branch] gh/karthickai/1/orig -> origin/gh/karthickai/1/orig 2025-09-07T07:46:33.5971463Z * [new branch] gh/karthickai/2/base -> origin/gh/karthickai/2/base 2025-09-07T07:46:33.5971685Z * [new branch] gh/karthickai/2/head -> origin/gh/karthickai/2/head 2025-09-07T07:46:33.5971923Z * [new branch] gh/karthickai/2/orig -> origin/gh/karthickai/2/orig 2025-09-07T07:46:33.5972251Z * [new branch] gh/kurtamohler/32/base -> origin/gh/kurtamohler/32/base 2025-09-07T07:46:33.5972493Z * [new branch] gh/kurtamohler/32/head -> origin/gh/kurtamohler/32/head 2025-09-07T07:46:33.5972743Z * [new branch] gh/kurtamohler/32/orig -> origin/gh/kurtamohler/32/orig 2025-09-07T07:46:33.5972975Z * [new branch] gh/kurtamohler/33/base -> origin/gh/kurtamohler/33/base 2025-09-07T07:46:33.5973223Z * [new branch] gh/kurtamohler/33/head -> origin/gh/kurtamohler/33/head 2025-09-07T07:46:33.5973463Z * [new branch] gh/kurtamohler/33/orig -> origin/gh/kurtamohler/33/orig 2025-09-07T07:46:33.5973699Z * [new branch] gh/kurtamohler/34/base -> origin/gh/kurtamohler/34/base 2025-09-07T07:46:33.5973946Z * [new branch] gh/kurtamohler/34/head -> origin/gh/kurtamohler/34/head 2025-09-07T07:46:33.5974181Z * [new branch] gh/kurtamohler/34/orig -> origin/gh/kurtamohler/34/orig 2025-09-07T07:46:33.5974425Z * [new branch] gh/kurtamohler/41/base -> origin/gh/kurtamohler/41/base 2025-09-07T07:46:33.5974666Z * [new branch] gh/kurtamohler/41/head -> origin/gh/kurtamohler/41/head 2025-09-07T07:46:33.5974916Z * [new branch] gh/kurtamohler/41/orig -> origin/gh/kurtamohler/41/orig 2025-09-07T07:46:33.5975153Z * [new branch] gh/kurtamohler/46/base -> origin/gh/kurtamohler/46/base 2025-09-07T07:46:33.5975388Z * [new branch] gh/kurtamohler/46/head -> origin/gh/kurtamohler/46/head 2025-09-07T07:46:33.5975634Z * [new branch] gh/kurtamohler/46/orig -> origin/gh/kurtamohler/46/orig 2025-09-07T07:46:33.5975870Z * [new branch] gh/kurtamohler/47/base -> origin/gh/kurtamohler/47/base 
2025-09-07T07:46:33.5976115Z * [new branch] gh/kurtamohler/47/head -> origin/gh/kurtamohler/47/head 2025-09-07T07:46:33.5976356Z * [new branch] gh/kurtamohler/47/orig -> origin/gh/kurtamohler/47/orig 2025-09-07T07:46:33.5976691Z * [new branch] gh/kurtamohler/48/base -> origin/gh/kurtamohler/48/base 2025-09-07T07:46:33.5976932Z * [new branch] gh/kurtamohler/48/head -> origin/gh/kurtamohler/48/head 2025-09-07T07:46:33.5977169Z * [new branch] gh/kurtamohler/48/orig -> origin/gh/kurtamohler/48/orig 2025-09-07T07:46:33.5977504Z * [new branch] gh/kurtamohler/49/base -> origin/gh/kurtamohler/49/base 2025-09-07T07:46:33.5977747Z * [new branch] gh/kurtamohler/49/head -> origin/gh/kurtamohler/49/head 2025-09-07T07:46:33.5977991Z * [new branch] gh/kurtamohler/49/orig -> origin/gh/kurtamohler/49/orig 2025-09-07T07:46:33.5978227Z * [new branch] gh/kurtamohler/50/base -> origin/gh/kurtamohler/50/base 2025-09-07T07:46:33.5978461Z * [new branch] gh/kurtamohler/50/head -> origin/gh/kurtamohler/50/head 2025-09-07T07:46:33.5978715Z * [new branch] gh/kurtamohler/50/orig -> origin/gh/kurtamohler/50/orig 2025-09-07T07:46:33.5978937Z * [new branch] gh/kwen2501/130/base -> origin/gh/kwen2501/130/base 2025-09-07T07:46:33.5979169Z * [new branch] gh/kwen2501/130/head -> origin/gh/kwen2501/130/head 2025-09-07T07:46:33.5979386Z * [new branch] gh/kwen2501/130/orig -> origin/gh/kwen2501/130/orig 2025-09-07T07:46:33.5979611Z * [new branch] gh/kwen2501/15/base -> origin/gh/kwen2501/15/base 2025-09-07T07:46:33.5979828Z * [new branch] gh/kwen2501/15/head -> origin/gh/kwen2501/15/head 2025-09-07T07:46:33.5980046Z * [new branch] gh/kwen2501/156/base -> origin/gh/kwen2501/156/base 2025-09-07T07:46:33.5980274Z * [new branch] gh/kwen2501/156/head -> origin/gh/kwen2501/156/head 2025-09-07T07:46:33.5980625Z * [new branch] gh/kwen2501/156/orig -> origin/gh/kwen2501/156/orig 2025-09-07T07:46:33.5980848Z * [new branch] gh/kwen2501/170/base -> origin/gh/kwen2501/170/base 2025-09-07T07:46:33.5982750Z * [new 
branch] gh/kwen2501/170/head -> origin/gh/kwen2501/170/head 2025-09-07T07:46:33.5982977Z * [new branch] gh/kwen2501/186/base -> origin/gh/kwen2501/186/base 2025-09-07T07:46:33.5983199Z * [new branch] gh/kwen2501/186/head -> origin/gh/kwen2501/186/head 2025-09-07T07:46:33.5983414Z * [new branch] gh/kwen2501/186/orig -> origin/gh/kwen2501/186/orig 2025-09-07T07:46:33.5983645Z * [new branch] gh/kwen2501/187/base -> origin/gh/kwen2501/187/base 2025-09-07T07:46:33.5983862Z * [new branch] gh/kwen2501/187/head -> origin/gh/kwen2501/187/head 2025-09-07T07:46:33.5984090Z * [new branch] gh/kwen2501/187/orig -> origin/gh/kwen2501/187/orig 2025-09-07T07:46:33.5984313Z * [new branch] gh/kwen2501/188/base -> origin/gh/kwen2501/188/base 2025-09-07T07:46:33.5984542Z * [new branch] gh/kwen2501/188/head -> origin/gh/kwen2501/188/head 2025-09-07T07:46:33.5984758Z * [new branch] gh/kwen2501/188/orig -> origin/gh/kwen2501/188/orig 2025-09-07T07:46:33.5984976Z * [new branch] gh/kwen2501/194/base -> origin/gh/kwen2501/194/base 2025-09-07T07:46:33.5985201Z * [new branch] gh/kwen2501/194/head -> origin/gh/kwen2501/194/head 2025-09-07T07:46:33.5985418Z * [new branch] gh/kwen2501/194/orig -> origin/gh/kwen2501/194/orig 2025-09-07T07:46:33.5985643Z * [new branch] gh/kwen2501/199/base -> origin/gh/kwen2501/199/base 2025-09-07T07:46:33.5985857Z * [new branch] gh/kwen2501/199/head -> origin/gh/kwen2501/199/head 2025-09-07T07:46:33.5986076Z * [new branch] gh/kwen2501/199/orig -> origin/gh/kwen2501/199/orig 2025-09-07T07:46:33.5986309Z * [new branch] gh/kwen2501/200/base -> origin/gh/kwen2501/200/base 2025-09-07T07:46:33.5986639Z * [new branch] gh/kwen2501/200/head -> origin/gh/kwen2501/200/head 2025-09-07T07:46:33.5986877Z * [new branch] gh/kwen2501/200/orig -> origin/gh/kwen2501/200/orig 2025-09-07T07:46:33.5987092Z * [new branch] gh/kwen2501/201/base -> origin/gh/kwen2501/201/base 2025-09-07T07:46:33.5987314Z * [new branch] gh/kwen2501/201/head -> origin/gh/kwen2501/201/head 
2025-09-07T07:46:33.5987533Z * [new branch] gh/kwen2501/201/orig -> origin/gh/kwen2501/201/orig 2025-09-07T07:46:33.5987749Z * [new branch] gh/kwen2501/203/base -> origin/gh/kwen2501/203/base 2025-09-07T07:46:33.5987978Z * [new branch] gh/kwen2501/203/head -> origin/gh/kwen2501/203/head 2025-09-07T07:46:33.5988197Z * [new branch] gh/kwen2501/203/orig -> origin/gh/kwen2501/203/orig 2025-09-07T07:46:33.5988430Z * [new branch] gh/kwen2501/204/base -> origin/gh/kwen2501/204/base 2025-09-07T07:46:33.5988652Z * [new branch] gh/kwen2501/204/head -> origin/gh/kwen2501/204/head 2025-09-07T07:46:33.5988870Z * [new branch] gh/kwen2501/204/orig -> origin/gh/kwen2501/204/orig 2025-09-07T07:46:33.5989099Z * [new branch] gh/kwen2501/205/base -> origin/gh/kwen2501/205/base 2025-09-07T07:46:33.5989316Z * [new branch] gh/kwen2501/205/head -> origin/gh/kwen2501/205/head 2025-09-07T07:46:33.5989538Z * [new branch] gh/kwen2501/205/orig -> origin/gh/kwen2501/205/orig 2025-09-07T07:46:33.5989757Z * [new branch] gh/kwen2501/206/base -> origin/gh/kwen2501/206/base 2025-09-07T07:46:33.5989985Z * [new branch] gh/kwen2501/206/head -> origin/gh/kwen2501/206/head 2025-09-07T07:46:33.5990301Z * [new branch] gh/kwen2501/206/orig -> origin/gh/kwen2501/206/orig 2025-09-07T07:46:33.5990518Z * [new branch] gh/kwen2501/207/base -> origin/gh/kwen2501/207/base 2025-09-07T07:46:33.5990754Z * [new branch] gh/kwen2501/207/head -> origin/gh/kwen2501/207/head 2025-09-07T07:46:33.5990971Z * [new branch] gh/kwen2501/207/orig -> origin/gh/kwen2501/207/orig 2025-09-07T07:46:33.5991199Z * [new branch] gh/kwen2501/208/base -> origin/gh/kwen2501/208/base 2025-09-07T07:46:33.5991417Z * [new branch] gh/kwen2501/208/head -> origin/gh/kwen2501/208/head 2025-09-07T07:46:33.5991647Z * [new branch] gh/kwen2501/208/orig -> origin/gh/kwen2501/208/orig 2025-09-07T07:46:33.5991868Z * [new branch] gh/kwen2501/209/base -> origin/gh/kwen2501/209/base 2025-09-07T07:46:33.5992085Z * [new branch] gh/kwen2501/209/head -> 
origin/gh/kwen2501/209/head 2025-09-07T07:46:33.5992320Z * [new branch] gh/kwen2501/209/orig -> origin/gh/kwen2501/209/orig 2025-09-07T07:46:33.5992537Z * [new branch] gh/kwen2501/210/base -> origin/gh/kwen2501/210/base 2025-09-07T07:46:33.5992765Z * [new branch] gh/kwen2501/210/head -> origin/gh/kwen2501/210/head 2025-09-07T07:46:33.5992982Z * [new branch] gh/kwen2501/210/orig -> origin/gh/kwen2501/210/orig 2025-09-07T07:46:33.5993198Z * [new branch] gh/kwen2501/211/base -> origin/gh/kwen2501/211/base 2025-09-07T07:46:33.5993427Z * [new branch] gh/kwen2501/211/head -> origin/gh/kwen2501/211/head 2025-09-07T07:46:33.5993642Z * [new branch] gh/kwen2501/212/base -> origin/gh/kwen2501/212/base 2025-09-07T07:46:33.5993873Z * [new branch] gh/kwen2501/212/head -> origin/gh/kwen2501/212/head 2025-09-07T07:46:33.5994090Z * [new branch] gh/kwen2501/212/orig -> origin/gh/kwen2501/212/orig 2025-09-07T07:46:33.5994324Z * [new branch] gh/kwen2501/213/base -> origin/gh/kwen2501/213/base 2025-09-07T07:46:33.5994655Z * [new branch] gh/kwen2501/213/head -> origin/gh/kwen2501/213/head 2025-09-07T07:46:33.5994874Z * [new branch] gh/kwen2501/213/orig -> origin/gh/kwen2501/213/orig 2025-09-07T07:46:33.5995102Z * [new branch] gh/kwen2501/214/base -> origin/gh/kwen2501/214/base 2025-09-07T07:46:33.5995319Z * [new branch] gh/kwen2501/214/head -> origin/gh/kwen2501/214/head 2025-09-07T07:46:33.5995549Z * [new branch] gh/kwen2501/214/orig -> origin/gh/kwen2501/214/orig 2025-09-07T07:46:33.5995764Z * [new branch] gh/kwen2501/215/base -> origin/gh/kwen2501/215/base 2025-09-07T07:46:33.5995980Z * [new branch] gh/kwen2501/215/head -> origin/gh/kwen2501/215/head 2025-09-07T07:46:33.5996212Z * [new branch] gh/kwen2501/215/orig -> origin/gh/kwen2501/215/orig 2025-09-07T07:46:33.5996432Z * [new branch] gh/kwen2501/216/base -> origin/gh/kwen2501/216/base 2025-09-07T07:46:33.5996659Z * [new branch] gh/kwen2501/216/head -> origin/gh/kwen2501/216/head 2025-09-07T07:46:33.5996878Z * [new branch] 
gh/kwen2501/216/orig -> origin/gh/kwen2501/216/orig 2025-09-07T07:46:33.5997108Z * [new branch] gh/kwen2501/217/base -> origin/gh/kwen2501/217/base 2025-09-07T07:46:33.5997327Z * [new branch] gh/kwen2501/217/head -> origin/gh/kwen2501/217/head 2025-09-07T07:46:33.5997541Z * [new branch] gh/kwen2501/217/orig -> origin/gh/kwen2501/217/orig 2025-09-07T07:46:33.5997767Z * [new branch] gh/kwen2501/218/base -> origin/gh/kwen2501/218/base 2025-09-07T07:46:33.5997982Z * [new branch] gh/kwen2501/218/head -> origin/gh/kwen2501/218/head 2025-09-07T07:46:33.5998301Z * [new branch] gh/kwen2501/218/orig -> origin/gh/kwen2501/218/orig 2025-09-07T07:46:33.5998517Z * [new branch] gh/kwen2501/219/base -> origin/gh/kwen2501/219/base 2025-09-07T07:46:33.5998750Z * [new branch] gh/kwen2501/219/head -> origin/gh/kwen2501/219/head 2025-09-07T07:46:33.5998967Z * [new branch] gh/kwen2501/219/orig -> origin/gh/kwen2501/219/orig 2025-09-07T07:46:33.5999183Z * [new branch] gh/kwen2501/220/base -> origin/gh/kwen2501/220/base 2025-09-07T07:46:33.5999409Z * [new branch] gh/kwen2501/220/head -> origin/gh/kwen2501/220/head 2025-09-07T07:46:33.5999626Z * [new branch] gh/kwen2501/220/orig -> origin/gh/kwen2501/220/orig 2025-09-07T07:46:33.5999855Z * [new branch] gh/kwen2501/221/base -> origin/gh/kwen2501/221/base 2025-09-07T07:46:33.6000071Z * [new branch] gh/kwen2501/221/head -> origin/gh/kwen2501/221/head 2025-09-07T07:46:33.6000294Z * [new branch] gh/kwen2501/221/orig -> origin/gh/kwen2501/221/orig 2025-09-07T07:46:33.6000520Z * [new branch] gh/kwen2501/222/base -> origin/gh/kwen2501/222/base 2025-09-07T07:46:33.6000736Z * [new branch] gh/kwen2501/222/head -> origin/gh/kwen2501/222/head 2025-09-07T07:46:33.6000960Z * [new branch] gh/kwen2501/222/orig -> origin/gh/kwen2501/222/orig 2025-09-07T07:46:33.6001173Z * [new branch] gh/kwen2501/223/base -> origin/gh/kwen2501/223/base 2025-09-07T07:46:33.6001401Z * [new branch] gh/kwen2501/223/head -> origin/gh/kwen2501/223/head 
2025-09-07T07:46:33.6001615Z * [new branch] gh/kwen2501/223/orig -> origin/gh/kwen2501/223/orig 2025-09-07T07:46:33.6001834Z * [new branch] gh/kwen2501/224/base -> origin/gh/kwen2501/224/base 2025-09-07T07:46:33.6002062Z * [new branch] gh/kwen2501/224/head -> origin/gh/kwen2501/224/head 2025-09-07T07:46:33.6002282Z * [new branch] gh/kwen2501/224/orig -> origin/gh/kwen2501/224/orig 2025-09-07T07:46:33.6002602Z * [new branch] gh/kwen2501/225/base -> origin/gh/kwen2501/225/base 2025-09-07T07:46:33.6002957Z * [new branch] gh/kwen2501/225/head -> origin/gh/kwen2501/225/head 2025-09-07T07:46:33.6003178Z * [new branch] gh/kwen2501/225/orig -> origin/gh/kwen2501/225/orig 2025-09-07T07:46:33.6003406Z * [new branch] gh/kwen2501/226/base -> origin/gh/kwen2501/226/base 2025-09-07T07:46:33.6003622Z * [new branch] gh/kwen2501/226/head -> origin/gh/kwen2501/226/head 2025-09-07T07:46:33.6003847Z * [new branch] gh/kwen2501/226/orig -> origin/gh/kwen2501/226/orig 2025-09-07T07:46:33.6004063Z * [new branch] gh/kwen2501/227/base -> origin/gh/kwen2501/227/base 2025-09-07T07:46:33.6004292Z * [new branch] gh/kwen2501/227/head -> origin/gh/kwen2501/227/head 2025-09-07T07:46:33.6004512Z * [new branch] gh/kwen2501/227/orig -> origin/gh/kwen2501/227/orig 2025-09-07T07:46:33.6004726Z * [new branch] gh/kwen2501/228/base -> origin/gh/kwen2501/228/base 2025-09-07T07:46:33.6004956Z * [new branch] gh/kwen2501/228/head -> origin/gh/kwen2501/228/head 2025-09-07T07:46:33.6005170Z * [new branch] gh/kwen2501/228/orig -> origin/gh/kwen2501/228/orig 2025-09-07T07:46:33.6005398Z * [new branch] gh/kwen2501/229/base -> origin/gh/kwen2501/229/base 2025-09-07T07:46:33.6005612Z * [new branch] gh/kwen2501/229/head -> origin/gh/kwen2501/229/head 2025-09-07T07:46:33.6005846Z * [new branch] gh/kwen2501/229/orig -> origin/gh/kwen2501/229/orig 2025-09-07T07:46:33.6006062Z * [new branch] gh/kwen2501/230/base -> origin/gh/kwen2501/230/base 2025-09-07T07:46:33.6006426Z * [new branch] gh/kwen2501/230/head -> 
origin/gh/kwen2501/230/head 2025-09-07T07:46:33.6006654Z * [new branch] gh/kwen2501/230/orig -> origin/gh/kwen2501/230/orig 2025-09-07T07:46:33.6006875Z * [new branch] gh/kwen2501/231/base -> origin/gh/kwen2501/231/base 2025-09-07T07:46:33.6007103Z * [new branch] gh/kwen2501/231/head -> origin/gh/kwen2501/231/head 2025-09-07T07:46:33.6007318Z * [new branch] gh/kwen2501/231/orig -> origin/gh/kwen2501/231/orig 2025-09-07T07:46:33.6007531Z * [new branch] gh/kwen2501/232/base -> origin/gh/kwen2501/232/base 2025-09-07T07:46:33.6007757Z * [new branch] gh/kwen2501/232/head -> origin/gh/kwen2501/232/head 2025-09-07T07:46:33.6007975Z * [new branch] gh/kwen2501/232/orig -> origin/gh/kwen2501/232/orig 2025-09-07T07:46:33.6008225Z * [new branch] gh/laithsakka/156/base -> origin/gh/laithsakka/156/base 2025-09-07T07:46:33.6008468Z * [new branch] gh/laithsakka/156/head -> origin/gh/laithsakka/156/head 2025-09-07T07:46:33.6008716Z * [new branch] gh/laithsakka/156/orig -> origin/gh/laithsakka/156/orig 2025-09-07T07:46:33.6008953Z * [new branch] gh/laithsakka/160/base -> origin/gh/laithsakka/160/base 2025-09-07T07:46:33.6009190Z * [new branch] gh/laithsakka/160/head -> origin/gh/laithsakka/160/head 2025-09-07T07:46:33.6009441Z * [new branch] gh/laithsakka/160/orig -> origin/gh/laithsakka/160/orig 2025-09-07T07:46:33.6009676Z * [new branch] gh/laithsakka/178/base -> origin/gh/laithsakka/178/base 2025-09-07T07:46:33.6009914Z * [new branch] gh/laithsakka/178/head -> origin/gh/laithsakka/178/head 2025-09-07T07:46:33.6010148Z * [new branch] gh/laithsakka/178/orig -> origin/gh/laithsakka/178/orig 2025-09-07T07:46:33.6010392Z * [new branch] gh/laithsakka/191/base -> origin/gh/laithsakka/191/base 2025-09-07T07:46:33.6010630Z * [new branch] gh/laithsakka/191/head -> origin/gh/laithsakka/191/head 2025-09-07T07:46:33.6011048Z * [new branch] gh/laithsakka/191/orig -> origin/gh/laithsakka/191/orig 2025-09-07T07:46:33.6011300Z * [new branch] gh/laithsakka/237/base -> origin/gh/laithsakka/237/base 
2025-09-07T07:46:33.6011536Z * [new branch] gh/laithsakka/237/head -> origin/gh/laithsakka/237/head 2025-09-07T07:46:33.6011782Z * [new branch] gh/laithsakka/237/orig -> origin/gh/laithsakka/237/orig 2025-09-07T07:46:33.6012010Z * [new branch] gh/laithsakka/249/base -> origin/gh/laithsakka/249/base 2025-09-07T07:46:33.6012246Z * [new branch] gh/laithsakka/249/head -> origin/gh/laithsakka/249/head 2025-09-07T07:46:33.6012491Z * [new branch] gh/laithsakka/249/orig -> origin/gh/laithsakka/249/orig 2025-09-07T07:46:33.6012731Z * [new branch] gh/laithsakka/251/base -> origin/gh/laithsakka/251/base 2025-09-07T07:46:33.6012979Z * [new branch] gh/laithsakka/251/head -> origin/gh/laithsakka/251/head 2025-09-07T07:46:33.6013216Z * [new branch] gh/laithsakka/251/orig -> origin/gh/laithsakka/251/orig 2025-09-07T07:46:33.6013466Z * [new branch] gh/laithsakka/254/base -> origin/gh/laithsakka/254/base 2025-09-07T07:46:33.6013699Z * [new branch] gh/laithsakka/254/head -> origin/gh/laithsakka/254/head 2025-09-07T07:46:33.6013930Z * [new branch] gh/laithsakka/254/orig -> origin/gh/laithsakka/254/orig 2025-09-07T07:46:33.6014179Z * [new branch] gh/laithsakka/255/base -> origin/gh/laithsakka/255/base 2025-09-07T07:46:33.6014414Z * [new branch] gh/laithsakka/255/head -> origin/gh/laithsakka/255/head 2025-09-07T07:46:33.6014664Z * [new branch] gh/laithsakka/255/orig -> origin/gh/laithsakka/255/orig 2025-09-07T07:46:33.6014990Z * [new branch] gh/laithsakka/256/base -> origin/gh/laithsakka/256/base 2025-09-07T07:46:33.6015241Z * [new branch] gh/laithsakka/256/head -> origin/gh/laithsakka/256/head 2025-09-07T07:46:33.6015479Z * [new branch] gh/laithsakka/256/orig -> origin/gh/laithsakka/256/orig 2025-09-07T07:46:33.6015711Z * [new branch] gh/laithsakka/257/base -> origin/gh/laithsakka/257/base 2025-09-07T07:46:33.6015947Z * [new branch] gh/laithsakka/257/head -> origin/gh/laithsakka/257/head 2025-09-07T07:46:33.6016180Z * [new branch] gh/laithsakka/257/orig -> origin/gh/laithsakka/257/orig 
2025-09-07T07:46:33.6016421Z * [new branch] gh/laithsakka/258/base -> origin/gh/laithsakka/258/base 2025-09-07T07:46:33.6016653Z * [new branch] gh/laithsakka/258/head -> origin/gh/laithsakka/258/head 2025-09-07T07:46:33.6016890Z * [new branch] gh/laithsakka/258/orig -> origin/gh/laithsakka/258/orig 2025-09-07T07:46:33.6017143Z * [new branch] gh/laithsakka/259/base -> origin/gh/laithsakka/259/base 2025-09-07T07:46:33.6017476Z * [new branch] gh/laithsakka/259/head -> origin/gh/laithsakka/259/head 2025-09-07T07:46:33.6017726Z * [new branch] gh/laithsakka/259/orig -> origin/gh/laithsakka/259/orig 2025-09-07T07:46:33.6017960Z * [new branch] gh/laithsakka/260/base -> origin/gh/laithsakka/260/base 2025-09-07T07:46:33.6018206Z * [new branch] gh/laithsakka/260/head -> origin/gh/laithsakka/260/head 2025-09-07T07:46:33.6018442Z * [new branch] gh/laithsakka/260/orig -> origin/gh/laithsakka/260/orig 2025-09-07T07:46:33.6018677Z * [new branch] gh/laithsakka/261/base -> origin/gh/laithsakka/261/base 2025-09-07T07:46:33.6018924Z * [new branch] gh/laithsakka/261/head -> origin/gh/laithsakka/261/head 2025-09-07T07:46:33.6019162Z * [new branch] gh/laithsakka/261/orig -> origin/gh/laithsakka/261/orig 2025-09-07T07:46:33.6019404Z * [new branch] gh/laithsakka/262/base -> origin/gh/laithsakka/262/base 2025-09-07T07:46:33.6019738Z * [new branch] gh/laithsakka/262/head -> origin/gh/laithsakka/262/head 2025-09-07T07:46:33.6019991Z * [new branch] gh/laithsakka/262/orig -> origin/gh/laithsakka/262/orig 2025-09-07T07:46:33.6020227Z * [new branch] gh/laithsakka/263/base -> origin/gh/laithsakka/263/base 2025-09-07T07:46:33.6020461Z * [new branch] gh/laithsakka/263/head -> origin/gh/laithsakka/263/head 2025-09-07T07:46:33.6020707Z * [new branch] gh/laithsakka/263/orig -> origin/gh/laithsakka/263/orig 2025-09-07T07:46:33.6020942Z * [new branch] gh/laithsakka/264/base -> origin/gh/laithsakka/264/base 2025-09-07T07:46:33.6021190Z * [new branch] gh/laithsakka/264/head -> origin/gh/laithsakka/264/head 
2025-09-07T07:46:33.6021424Z * [new branch] gh/laithsakka/264/orig -> origin/gh/laithsakka/264/orig 2025-09-07T07:46:33.6021664Z * [new branch] gh/laithsakka/265/base -> origin/gh/laithsakka/265/base 2025-09-07T07:46:33.6021906Z * [new branch] gh/laithsakka/265/head -> origin/gh/laithsakka/265/head 2025-09-07T07:46:33.6022140Z * [new branch] gh/laithsakka/265/orig -> origin/gh/laithsakka/265/orig 2025-09-07T07:46:33.6022382Z * [new branch] gh/laithsakka/266/base -> origin/gh/laithsakka/266/base 2025-09-07T07:46:33.6022615Z * [new branch] gh/laithsakka/266/head -> origin/gh/laithsakka/266/head 2025-09-07T07:46:33.6022856Z * [new branch] gh/laithsakka/266/orig -> origin/gh/laithsakka/266/orig 2025-09-07T07:46:33.6023092Z * [new branch] gh/laithsakka/267/base -> origin/gh/laithsakka/267/base 2025-09-07T07:46:33.6023423Z * [new branch] gh/laithsakka/267/head -> origin/gh/laithsakka/267/head 2025-09-07T07:46:33.6023667Z * [new branch] gh/laithsakka/267/orig -> origin/gh/laithsakka/267/orig 2025-09-07T07:46:33.6023903Z * [new branch] gh/laithsakka/268/base -> origin/gh/laithsakka/268/base 2025-09-07T07:46:33.6024149Z * [new branch] gh/laithsakka/268/head -> origin/gh/laithsakka/268/head 2025-09-07T07:46:33.6024380Z * [new branch] gh/laithsakka/268/orig -> origin/gh/laithsakka/268/orig 2025-09-07T07:46:33.6024623Z * [new branch] gh/laithsakka/28/base -> origin/gh/laithsakka/28/base 2025-09-07T07:46:33.6024855Z * [new branch] gh/laithsakka/29/base -> origin/gh/laithsakka/29/base 2025-09-07T07:46:33.6025084Z * [new branch] gh/laithsakka/30/base -> origin/gh/laithsakka/30/base 2025-09-07T07:46:33.6025326Z * [new branch] gh/laithsakka/30/head -> origin/gh/laithsakka/30/head 2025-09-07T07:46:33.6025557Z * [new branch] gh/laithsakka/31/base -> origin/gh/laithsakka/31/base 2025-09-07T07:46:33.6025803Z * [new branch] gh/laithsakka/31/head -> origin/gh/laithsakka/31/head 2025-09-07T07:46:33.6026027Z * [new branch] gh/laithsakka/32/base -> origin/gh/laithsakka/32/base 
2025-09-07T07:46:33.6026257Z * [new branch] gh/laithsakka/32/head -> origin/gh/laithsakka/32/head 2025-09-07T07:46:33.6026499Z * [new branch] gh/lucaskabela/1/base -> origin/gh/lucaskabela/1/base 2025-09-07T07:46:33.6026729Z * [new branch] gh/lucaskabela/1/head -> origin/gh/lucaskabela/1/head 2025-09-07T07:46:33.6026975Z * [new branch] gh/lucaskabela/10/base -> origin/gh/lucaskabela/10/base 2025-09-07T07:46:33.6027211Z * [new branch] gh/lucaskabela/10/head -> origin/gh/lucaskabela/10/head 2025-09-07T07:46:33.6027455Z * [new branch] gh/lucaskabela/10/orig -> origin/gh/lucaskabela/10/orig 2025-09-07T07:46:33.6027695Z * [new branch] gh/lucaskabela/11/base -> origin/gh/lucaskabela/11/base 2025-09-07T07:46:33.6028045Z * [new branch] gh/lucaskabela/11/head -> origin/gh/lucaskabela/11/head 2025-09-07T07:46:33.6028295Z * [new branch] gh/lucaskabela/11/orig -> origin/gh/lucaskabela/11/orig 2025-09-07T07:46:33.6028531Z * [new branch] gh/lucaskabela/12/base -> origin/gh/lucaskabela/12/base 2025-09-07T07:46:33.6028771Z * [new branch] gh/lucaskabela/12/head -> origin/gh/lucaskabela/12/head 2025-09-07T07:46:33.6029006Z * [new branch] gh/lucaskabela/12/orig -> origin/gh/lucaskabela/12/orig 2025-09-07T07:46:33.6029252Z * [new branch] gh/lucaskabela/13/base -> origin/gh/lucaskabela/13/base 2025-09-07T07:46:33.6029489Z * [new branch] gh/lucaskabela/13/head -> origin/gh/lucaskabela/13/head 2025-09-07T07:46:33.6029728Z * [new branch] gh/lucaskabela/13/orig -> origin/gh/lucaskabela/13/orig 2025-09-07T07:46:33.6029974Z * [new branch] gh/lucaskabela/14/base -> origin/gh/lucaskabela/14/base 2025-09-07T07:46:33.6030211Z * [new branch] gh/lucaskabela/14/head -> origin/gh/lucaskabela/14/head 2025-09-07T07:46:33.6030457Z * [new branch] gh/lucaskabela/14/orig -> origin/gh/lucaskabela/14/orig 2025-09-07T07:46:33.6030688Z * [new branch] gh/lucaskabela/15/base -> origin/gh/lucaskabela/15/base 2025-09-07T07:46:33.6030919Z * [new branch] gh/lucaskabela/15/head -> origin/gh/lucaskabela/15/head 
2025-09-07T07:46:33.6031167Z * [new branch] gh/lucaskabela/15/orig -> origin/gh/lucaskabela/15/orig 2025-09-07T07:46:33.6031404Z * [new branch] gh/lucaskabela/16/base -> origin/gh/lucaskabela/16/base 2025-09-07T07:46:33.6031650Z * [new branch] gh/lucaskabela/16/head -> origin/gh/lucaskabela/16/head 2025-09-07T07:46:33.6031979Z * [new branch] gh/lucaskabela/16/orig -> origin/gh/lucaskabela/16/orig 2025-09-07T07:46:33.6032226Z * [new branch] gh/lucaskabela/17/base -> origin/gh/lucaskabela/17/base 2025-09-07T07:46:33.6032461Z * [new branch] gh/lucaskabela/17/head -> origin/gh/lucaskabela/17/head 2025-09-07T07:46:33.6032697Z * [new branch] gh/lucaskabela/17/orig -> origin/gh/lucaskabela/17/orig 2025-09-07T07:46:33.6032936Z * [new branch] gh/lucaskabela/2/base -> origin/gh/lucaskabela/2/base 2025-09-07T07:46:33.6033171Z * [new branch] gh/lucaskabela/2/head -> origin/gh/lucaskabela/2/head 2025-09-07T07:46:33.6033413Z * [new branch] gh/lucaskabela/2/orig -> origin/gh/lucaskabela/2/orig 2025-09-07T07:46:33.6033649Z * [new branch] gh/lucaskabela/3/base -> origin/gh/lucaskabela/3/base 2025-09-07T07:46:33.6033891Z * [new branch] gh/lucaskabela/3/head -> origin/gh/lucaskabela/3/head 2025-09-07T07:46:33.6034131Z * [new branch] gh/lucaskabela/3/orig -> origin/gh/lucaskabela/3/orig 2025-09-07T07:46:33.6034366Z * [new branch] gh/lucaskabela/4/base -> origin/gh/lucaskabela/4/base 2025-09-07T07:46:33.6034614Z * [new branch] gh/lucaskabela/4/head -> origin/gh/lucaskabela/4/head 2025-09-07T07:46:33.6034849Z * [new branch] gh/lucaskabela/4/orig -> origin/gh/lucaskabela/4/orig 2025-09-07T07:46:33.6035094Z * [new branch] gh/lucaskabela/5/base -> origin/gh/lucaskabela/5/base 2025-09-07T07:46:33.6035327Z * [new branch] gh/lucaskabela/5/head -> origin/gh/lucaskabela/5/head 2025-09-07T07:46:33.6035560Z * [new branch] gh/lucaskabela/5/orig -> origin/gh/lucaskabela/5/orig 2025-09-07T07:46:33.6035801Z * [new branch] gh/lucaskabela/6/base -> origin/gh/lucaskabela/6/base 
2025-09-07T07:46:33.6036037Z * [new branch] gh/lucaskabela/6/head -> origin/gh/lucaskabela/6/head 2025-09-07T07:46:33.6036281Z * [new branch] gh/lucaskabela/6/orig -> origin/gh/lucaskabela/6/orig 2025-09-07T07:46:33.6036594Z * [new branch] gh/lucaskabela/7/base -> origin/gh/lucaskabela/7/base 2025-09-07T07:46:33.6036840Z * [new branch] gh/lucaskabela/7/head -> origin/gh/lucaskabela/7/head 2025-09-07T07:46:33.6037070Z * [new branch] gh/lucaskabela/7/orig -> origin/gh/lucaskabela/7/orig 2025-09-07T07:46:33.6037303Z * [new branch] gh/lucaskabela/8/base -> origin/gh/lucaskabela/8/base 2025-09-07T07:46:33.6037542Z * [new branch] gh/lucaskabela/8/head -> origin/gh/lucaskabela/8/head 2025-09-07T07:46:33.6037771Z * [new branch] gh/lucaskabela/8/orig -> origin/gh/lucaskabela/8/orig 2025-09-07T07:46:33.6038015Z * [new branch] gh/lucaskabela/9/base -> origin/gh/lucaskabela/9/base 2025-09-07T07:46:33.6038252Z * [new branch] gh/lucaskabela/9/head -> origin/gh/lucaskabela/9/head 2025-09-07T07:46:33.6038485Z * [new branch] gh/lucaskabela/9/orig -> origin/gh/lucaskabela/9/orig 2025-09-07T07:46:33.6038692Z * [new branch] gh/lw/3/base -> origin/gh/lw/3/base 2025-09-07T07:46:33.6038881Z * [new branch] gh/lw/3/head -> origin/gh/lw/3/head 2025-09-07T07:46:33.6039081Z * [new branch] gh/lw/3/orig -> origin/gh/lw/3/orig 2025-09-07T07:46:33.6039295Z * [new branch] gh/malfet/14/base -> origin/gh/malfet/14/base 2025-09-07T07:46:33.6039523Z * [new branch] gh/malfet/330/base -> origin/gh/malfet/330/base 2025-09-07T07:46:33.6039735Z * [new branch] gh/malfet/330/head -> origin/gh/malfet/330/head 2025-09-07T07:46:33.6039946Z * [new branch] gh/malfet/330/orig -> origin/gh/malfet/330/orig 2025-09-07T07:46:33.6040254Z * [new branch] gh/malfet/396/base -> origin/gh/malfet/396/base 2025-09-07T07:46:33.6040468Z * [new branch] gh/malfet/396/head -> origin/gh/malfet/396/head 2025-09-07T07:46:33.6040691Z * [new branch] gh/malfet/396/orig -> origin/gh/malfet/396/orig 2025-09-07T07:46:33.6040902Z * [new 
branch] gh/malfet/397/base -> origin/gh/malfet/397/base 2025-09-07T07:46:33.6041124Z * [new branch] gh/malfet/397/head -> origin/gh/malfet/397/head 2025-09-07T07:46:33.6041336Z * [new branch] gh/malfet/397/orig -> origin/gh/malfet/397/orig 2025-09-07T07:46:33.6041547Z * [new branch] gh/malfet/398/base -> origin/gh/malfet/398/base 2025-09-07T07:46:33.6041768Z * [new branch] gh/malfet/398/head -> origin/gh/malfet/398/head 2025-09-07T07:46:33.6041975Z * [new branch] gh/malfet/398/orig -> origin/gh/malfet/398/orig 2025-09-07T07:46:33.6042205Z * [new branch] gh/malfet/399/base -> origin/gh/malfet/399/base 2025-09-07T07:46:33.6042424Z * [new branch] gh/malfet/399/head -> origin/gh/malfet/399/head 2025-09-07T07:46:33.6042637Z * [new branch] gh/malfet/399/orig -> origin/gh/malfet/399/orig 2025-09-07T07:46:33.6042993Z * [new branch] gh/malfet/414/base -> origin/gh/malfet/414/base 2025-09-07T07:46:33.6043208Z * [new branch] gh/malfet/414/head -> origin/gh/malfet/414/head 2025-09-07T07:46:33.6043432Z * [new branch] gh/malfet/414/orig -> origin/gh/malfet/414/orig 2025-09-07T07:46:33.6043644Z * [new branch] gh/malfet/417/base -> origin/gh/malfet/417/base 2025-09-07T07:46:33.6043870Z * [new branch] gh/malfet/417/head -> origin/gh/malfet/417/head 2025-09-07T07:46:33.6044079Z * [new branch] gh/malfet/417/orig -> origin/gh/malfet/417/orig 2025-09-07T07:46:33.6044296Z * [new branch] gh/malfet/418/base -> origin/gh/malfet/418/base 2025-09-07T07:46:33.6044647Z * [new branch] gh/malfet/418/head -> origin/gh/malfet/418/head 2025-09-07T07:46:33.6044861Z * [new branch] gh/malfet/418/orig -> origin/gh/malfet/418/orig 2025-09-07T07:46:33.6045086Z * [new branch] gh/malfet/475/base -> origin/gh/malfet/475/base 2025-09-07T07:46:33.6045300Z * [new branch] gh/malfet/475/head -> origin/gh/malfet/475/head 2025-09-07T07:46:33.6045512Z * [new branch] gh/malfet/475/orig -> origin/gh/malfet/475/orig 2025-09-07T07:46:33.6045733Z * [new branch] gh/malfet/476/base -> origin/gh/malfet/476/base 
2025-09-07T07:46:33.6045947Z * [new branch] gh/malfet/476/head -> origin/gh/malfet/476/head 2025-09-07T07:46:33.6046170Z * [new branch] gh/malfet/476/orig -> origin/gh/malfet/476/orig 2025-09-07T07:46:33.6046380Z * [new branch] gh/malfet/477/base -> origin/gh/malfet/477/base 2025-09-07T07:46:33.6046606Z * [new branch] gh/malfet/477/head -> origin/gh/malfet/477/head 2025-09-07T07:46:33.6046818Z * [new branch] gh/malfet/477/orig -> origin/gh/malfet/477/orig 2025-09-07T07:46:33.6047025Z * [new branch] gh/malfet/478/base -> origin/gh/malfet/478/base 2025-09-07T07:46:33.6047249Z * [new branch] gh/malfet/478/head -> origin/gh/malfet/478/head 2025-09-07T07:46:33.6047457Z * [new branch] gh/malfet/478/orig -> origin/gh/malfet/478/orig 2025-09-07T07:46:33.6047684Z * [new branch] gh/malfet/479/base -> origin/gh/malfet/479/base 2025-09-07T07:46:33.6047898Z * [new branch] gh/malfet/479/head -> origin/gh/malfet/479/head 2025-09-07T07:46:33.6048123Z * [new branch] gh/malfet/479/orig -> origin/gh/malfet/479/orig 2025-09-07T07:46:33.6048462Z * [new branch] gh/malfet/480/base -> origin/gh/malfet/480/base 2025-09-07T07:46:33.6048673Z * [new branch] gh/malfet/480/head -> origin/gh/malfet/480/head 2025-09-07T07:46:33.6048898Z * [new branch] gh/malfet/480/orig -> origin/gh/malfet/480/orig 2025-09-07T07:46:33.6049120Z * [new branch] gh/malfet/481/base -> origin/gh/malfet/481/base 2025-09-07T07:46:33.6049345Z * [new branch] gh/malfet/481/head -> origin/gh/malfet/481/head 2025-09-07T07:46:33.6049556Z * [new branch] gh/malfet/481/orig -> origin/gh/malfet/481/orig 2025-09-07T07:46:33.6049768Z * [new branch] gh/malfet/482/base -> origin/gh/malfet/482/base 2025-09-07T07:46:33.6049987Z * [new branch] gh/malfet/482/head -> origin/gh/malfet/482/head 2025-09-07T07:46:33.6050204Z * [new branch] gh/malfet/482/orig -> origin/gh/malfet/482/orig 2025-09-07T07:46:33.6050426Z * [new branch] gh/malfet/483/base -> origin/gh/malfet/483/base 2025-09-07T07:46:33.6050642Z * [new branch] gh/malfet/483/head -> 
origin/gh/malfet/483/head 2025-09-07T07:46:33.6050855Z * [new branch] gh/malfet/483/orig -> origin/gh/malfet/483/orig 2025-09-07T07:46:33.6051072Z * [new branch] gh/malfet/484/base -> origin/gh/malfet/484/base 2025-09-07T07:46:33.6051284Z * [new branch] gh/malfet/484/head -> origin/gh/malfet/484/head 2025-09-07T07:46:33.6051500Z * [new branch] gh/malfet/484/orig -> origin/gh/malfet/484/orig 2025-09-07T07:46:33.6051710Z * [new branch] gh/malfet/485/base -> origin/gh/malfet/485/base 2025-09-07T07:46:33.6051926Z * [new branch] gh/malfet/485/head -> origin/gh/malfet/485/head 2025-09-07T07:46:33.6052137Z * [new branch] gh/malfet/485/orig -> origin/gh/malfet/485/orig 2025-09-07T07:46:33.6052347Z * [new branch] gh/malfet/486/base -> origin/gh/malfet/486/base 2025-09-07T07:46:33.7940753Z * [new branch] gh/malfet/486/head -> origin/gh/malfet/486/head 2025-09-07T07:46:33.7941622Z * [new branch] gh/malfet/486/orig -> origin/gh/malfet/486/orig 2025-09-07T07:46:33.7942172Z * [new branch] gh/malfet/487/base -> origin/gh/malfet/487/base 2025-09-07T07:46:33.7942725Z * [new branch] gh/malfet/487/head -> origin/gh/malfet/487/head 2025-09-07T07:46:33.7943270Z * [new branch] gh/malfet/487/orig -> origin/gh/malfet/487/orig 2025-09-07T07:46:33.7943808Z * [new branch] gh/malfet/488/base -> origin/gh/malfet/488/base 2025-09-07T07:46:33.7944346Z * [new branch] gh/malfet/488/head -> origin/gh/malfet/488/head 2025-09-07T07:46:33.7944892Z * [new branch] gh/malfet/488/orig -> origin/gh/malfet/488/orig 2025-09-07T07:46:33.7945433Z * [new branch] gh/malfet/489/base -> origin/gh/malfet/489/base 2025-09-07T07:46:33.7945976Z * [new branch] gh/malfet/489/head -> origin/gh/malfet/489/head 2025-09-07T07:46:33.7946518Z * [new branch] gh/malfet/489/orig -> origin/gh/malfet/489/orig 2025-09-07T07:46:33.7947054Z * [new branch] gh/malfet/490/base -> origin/gh/malfet/490/base 2025-09-07T07:46:33.7947584Z * [new branch] gh/malfet/490/head -> origin/gh/malfet/490/head 2025-09-07T07:46:33.7948123Z * [new 
branch] gh/malfet/490/orig -> origin/gh/malfet/490/orig 2025-09-07T07:46:33.7948664Z * [new branch] gh/malfet/491/base -> origin/gh/malfet/491/base 2025-09-07T07:46:33.7949207Z * [new branch] gh/malfet/491/head -> origin/gh/malfet/491/head 2025-09-07T07:46:33.7949934Z * [new branch] gh/malfet/491/orig -> origin/gh/malfet/491/orig 2025-09-07T07:46:33.7950477Z * [new branch] gh/malfet/492/base -> origin/gh/malfet/492/base 2025-09-07T07:46:33.7951023Z * [new branch] gh/malfet/492/head -> origin/gh/malfet/492/head 2025-09-07T07:46:33.7951565Z * [new branch] gh/malfet/492/orig -> origin/gh/malfet/492/orig 2025-09-07T07:46:33.7952108Z * [new branch] gh/malfet/493/base -> origin/gh/malfet/493/base 2025-09-07T07:46:33.7952644Z * [new branch] gh/malfet/493/head -> origin/gh/malfet/493/head 2025-09-07T07:46:33.7953187Z * [new branch] gh/malfet/493/orig -> origin/gh/malfet/493/orig 2025-09-07T07:46:33.7953729Z * [new branch] gh/malfet/494/base -> origin/gh/malfet/494/base 2025-09-07T07:46:33.7954271Z * [new branch] gh/malfet/494/head -> origin/gh/malfet/494/head 2025-09-07T07:46:33.7954816Z * [new branch] gh/malfet/494/orig -> origin/gh/malfet/494/orig 2025-09-07T07:46:33.7955351Z * [new branch] gh/malfet/495/base -> origin/gh/malfet/495/base 2025-09-07T07:46:33.7955889Z * [new branch] gh/malfet/495/head -> origin/gh/malfet/495/head 2025-09-07T07:46:33.7956430Z * [new branch] gh/malfet/495/orig -> origin/gh/malfet/495/orig 2025-09-07T07:46:33.7956977Z * [new branch] gh/malfet/496/base -> origin/gh/malfet/496/base 2025-09-07T07:46:33.7957629Z * [new branch] gh/malfet/496/head -> origin/gh/malfet/496/head 2025-09-07T07:46:33.7958167Z * [new branch] gh/malfet/496/orig -> origin/gh/malfet/496/orig 2025-09-07T07:46:33.7958716Z * [new branch] gh/malfet/497/base -> origin/gh/malfet/497/base 2025-09-07T07:46:33.7959262Z * [new branch] gh/malfet/497/head -> origin/gh/malfet/497/head 2025-09-07T07:46:33.7959814Z * [new branch] gh/malfet/497/orig -> origin/gh/malfet/497/orig 
2025-09-07T07:46:33.7960471Z * [new branch] gh/malfet/498/base -> origin/gh/malfet/498/base 2025-09-07T07:46:33.7961007Z * [new branch] gh/malfet/498/head -> origin/gh/malfet/498/head 2025-09-07T07:46:33.7961551Z * [new branch] gh/malfet/498/orig -> origin/gh/malfet/498/orig 2025-09-07T07:46:33.7962091Z * [new branch] gh/malfet/499/base -> origin/gh/malfet/499/base 2025-09-07T07:46:33.7962632Z * [new branch] gh/malfet/499/head -> origin/gh/malfet/499/head 2025-09-07T07:46:33.7963434Z * [new branch] gh/malfet/499/orig -> origin/gh/malfet/499/orig 2025-09-07T07:46:33.7963981Z * [new branch] gh/malfet/500/base -> origin/gh/malfet/500/base 2025-09-07T07:46:33.7964522Z * [new branch] gh/malfet/500/head -> origin/gh/malfet/500/head 2025-09-07T07:46:33.7965071Z * [new branch] gh/malfet/500/orig -> origin/gh/malfet/500/orig 2025-09-07T07:46:33.7965617Z * [new branch] gh/malfet/501/base -> origin/gh/malfet/501/base 2025-09-07T07:46:33.7966150Z * [new branch] gh/malfet/501/head -> origin/gh/malfet/501/head 2025-09-07T07:46:33.7966697Z * [new branch] gh/malfet/501/orig -> origin/gh/malfet/501/orig 2025-09-07T07:46:33.7967239Z * [new branch] gh/malfet/502/base -> origin/gh/malfet/502/base 2025-09-07T07:46:33.7967779Z * [new branch] gh/malfet/502/head -> origin/gh/malfet/502/head 2025-09-07T07:46:33.7968320Z * [new branch] gh/malfet/502/orig -> origin/gh/malfet/502/orig 2025-09-07T07:46:33.7968846Z * [new branch] gh/malfet/503/base -> origin/gh/malfet/503/base 2025-09-07T07:46:33.7969381Z * [new branch] gh/malfet/503/head -> origin/gh/malfet/503/head 2025-09-07T07:46:33.7970043Z * [new branch] gh/malfet/503/orig -> origin/gh/malfet/503/orig 2025-09-07T07:46:33.7970594Z * [new branch] gh/malfet/504/base -> origin/gh/malfet/504/base 2025-09-07T07:46:33.7971138Z * [new branch] gh/malfet/504/head -> origin/gh/malfet/504/head 2025-09-07T07:46:33.7971664Z * [new branch] gh/malfet/504/orig -> origin/gh/malfet/504/orig 2025-09-07T07:46:33.7972203Z * [new branch] gh/malfet/505/base -> 
origin/gh/malfet/505/base 2025-09-07T07:46:33.7972754Z * [new branch] gh/malfet/505/head -> origin/gh/malfet/505/head 2025-09-07T07:46:33.7973298Z * [new branch] gh/malfet/505/orig -> origin/gh/malfet/505/orig 2025-09-07T07:46:33.7973843Z * [new branch] gh/malfet/506/base -> origin/gh/malfet/506/base 2025-09-07T07:46:33.7974374Z * [new branch] gh/malfet/506/head -> origin/gh/malfet/506/head 2025-09-07T07:46:33.7974932Z * [new branch] gh/malfet/506/orig -> origin/gh/malfet/506/orig 2025-09-07T07:46:33.7975482Z * [new branch] gh/malfet/507/base -> origin/gh/malfet/507/base 2025-09-07T07:46:33.7976037Z * [new branch] gh/malfet/507/head -> origin/gh/malfet/507/head 2025-09-07T07:46:33.7976591Z * [new branch] gh/malfet/507/orig -> origin/gh/malfet/507/orig 2025-09-07T07:46:33.7977129Z * [new branch] gh/malfet/508/base -> origin/gh/malfet/508/base 2025-09-07T07:46:33.7977779Z * [new branch] gh/malfet/508/head -> origin/gh/malfet/508/head 2025-09-07T07:46:33.7978331Z * [new branch] gh/malfet/508/orig -> origin/gh/malfet/508/orig 2025-09-07T07:46:33.7978882Z * [new branch] gh/malfet/509/base -> origin/gh/malfet/509/base 2025-09-07T07:46:33.7979417Z * [new branch] gh/malfet/509/head -> origin/gh/malfet/509/head 2025-09-07T07:46:33.7979970Z * [new branch] gh/malfet/509/orig -> origin/gh/malfet/509/orig 2025-09-07T07:46:33.7980664Z * [new branch] gh/malfet/510/base -> origin/gh/malfet/510/base 2025-09-07T07:46:33.7981239Z * [new branch] gh/malfet/510/head -> origin/gh/malfet/510/head 2025-09-07T07:46:33.7981800Z * [new branch] gh/malfet/510/orig -> origin/gh/malfet/510/orig 2025-09-07T07:46:33.7982339Z * [new branch] gh/malfet/511/base -> origin/gh/malfet/511/base 2025-09-07T07:46:33.7982895Z * [new branch] gh/malfet/511/head -> origin/gh/malfet/511/head 2025-09-07T07:46:33.7983453Z * [new branch] gh/malfet/511/orig -> origin/gh/malfet/511/orig 2025-09-07T07:46:33.7984002Z * [new branch] gh/malfet/512/base -> origin/gh/malfet/512/base 2025-09-07T07:46:33.7984556Z * [new 
branch] gh/malfet/512/head -> origin/gh/malfet/512/head 2025-09-07T07:46:33.7985093Z * [new branch] gh/malfet/512/orig -> origin/gh/malfet/512/orig 2025-09-07T07:46:33.7985652Z * [new branch] gh/malfet/513/base -> origin/gh/malfet/513/base 2025-09-07T07:46:33.7986209Z * [new branch] gh/malfet/513/head -> origin/gh/malfet/513/head 2025-09-07T07:46:33.7986758Z * [new branch] gh/malfet/513/orig -> origin/gh/malfet/513/orig 2025-09-07T07:46:33.7987308Z * [new branch] gh/malfet/64/base -> origin/gh/malfet/64/base 2025-09-07T07:46:33.7987845Z * [new branch] gh/malfet/64/head -> origin/gh/malfet/64/head 2025-09-07T07:46:33.7988446Z * [new branch] gh/manuelcandales/10/base -> origin/gh/manuelcandales/10/base 2025-09-07T07:46:33.7989093Z * [new branch] gh/manuelcandales/10/head -> origin/gh/manuelcandales/10/head 2025-09-07T07:46:33.7989843Z * [new branch] gh/manuelcandales/10/orig -> origin/gh/manuelcandales/10/orig 2025-09-07T07:46:33.7990481Z * [new branch] gh/manuelcandales/11/base -> origin/gh/manuelcandales/11/base 2025-09-07T07:46:33.7991115Z * [new branch] gh/manuelcandales/11/head -> origin/gh/manuelcandales/11/head 2025-09-07T07:46:33.7991755Z * [new branch] gh/manuelcandales/11/orig -> origin/gh/manuelcandales/11/orig 2025-09-07T07:46:33.7992404Z * [new branch] gh/manuelcandales/9/base -> origin/gh/manuelcandales/9/base 2025-09-07T07:46:33.7993043Z * [new branch] gh/manuelcandales/9/head -> origin/gh/manuelcandales/9/head 2025-09-07T07:46:33.7993675Z * [new branch] gh/manuelcandales/9/orig -> origin/gh/manuelcandales/9/orig 2025-09-07T07:46:33.7994255Z * [new branch] gh/markkm/1/base -> origin/gh/markkm/1/base 2025-09-07T07:46:33.7994824Z * [new branch] gh/masnesral/204/base -> origin/gh/masnesral/204/base 2025-09-07T07:46:33.7995413Z * [new branch] gh/masnesral/204/head -> origin/gh/masnesral/204/head 2025-09-07T07:46:33.7996001Z * [new branch] gh/masnesral/204/orig -> origin/gh/masnesral/204/orig 2025-09-07T07:46:33.7996570Z * [new branch] gh/masnesral/235/base 
-> origin/gh/masnesral/235/base 2025-09-07T07:46:33.7997153Z * [new branch] gh/masnesral/235/head -> origin/gh/masnesral/235/head 2025-09-07T07:46:33.7997737Z * [new branch] gh/masnesral/235/orig -> origin/gh/masnesral/235/orig 2025-09-07T07:46:33.7998315Z * [new branch] gh/masnesral/34/base -> origin/gh/masnesral/34/base 2025-09-07T07:46:33.7998893Z * [new branch] gh/mhorowitz/0/base -> origin/gh/mhorowitz/0/base 2025-09-07T07:46:33.7999451Z * [new branch] gh/mhorowitz/0/head -> origin/gh/mhorowitz/0/head 2025-09-07T07:46:33.8000018Z * [new branch] gh/mhorowitz/1/base -> origin/gh/mhorowitz/1/base 2025-09-07T07:46:33.8000585Z * [new branch] gh/mhorowitz/1/head -> origin/gh/mhorowitz/1/head 2025-09-07T07:46:33.8001241Z * [new branch] gh/mhorowitz/2/base -> origin/gh/mhorowitz/2/base 2025-09-07T07:46:33.8001814Z * [new branch] gh/mhorowitz/2/head -> origin/gh/mhorowitz/2/head 2025-09-07T07:46:33.8002370Z * [new branch] gh/mhorowitz/3/base -> origin/gh/mhorowitz/3/base 2025-09-07T07:46:33.8003071Z * [new branch] gh/mhorowitz/3/head -> origin/gh/mhorowitz/3/head 2025-09-07T07:46:33.8003645Z * [new branch] gh/mhorowitz/4/base -> origin/gh/mhorowitz/4/base 2025-09-07T07:46:33.8004216Z * [new branch] gh/mhorowitz/4/head -> origin/gh/mhorowitz/4/head 2025-09-07T07:46:33.8004784Z * [new branch] gh/mhorowitz/5/base -> origin/gh/mhorowitz/5/base 2025-09-07T07:46:33.8005344Z * [new branch] gh/mhorowitz/5/head -> origin/gh/mhorowitz/5/head 2025-09-07T07:46:33.8005912Z * [new branch] gh/mhorowitz/6/base -> origin/gh/mhorowitz/6/base 2025-09-07T07:46:33.8006485Z * [new branch] gh/mhorowitz/6/head -> origin/gh/mhorowitz/6/head 2025-09-07T07:46:33.8007117Z * [new branch] gh/mikaylagawarecki/234/base -> origin/gh/mikaylagawarecki/234/base 2025-09-07T07:46:33.8007810Z * [new branch] gh/mikaylagawarecki/234/head -> origin/gh/mikaylagawarecki/234/head 2025-09-07T07:46:33.8008481Z * [new branch] gh/mikaylagawarecki/235/base -> origin/gh/mikaylagawarecki/235/base 
2025-09-07T07:46:33.8009161Z * [new branch] gh/mikaylagawarecki/235/head -> origin/gh/mikaylagawarecki/235/head 2025-09-07T07:46:33.8009840Z * [new branch] gh/mikaylagawarecki/236/base -> origin/gh/mikaylagawarecki/236/base 2025-09-07T07:46:33.8010518Z * [new branch] gh/mikaylagawarecki/236/head -> origin/gh/mikaylagawarecki/236/head 2025-09-07T07:46:33.8011339Z * [new branch] gh/mikaylagawarecki/237/base -> origin/gh/mikaylagawarecki/237/base 2025-09-07T07:46:33.8012017Z * [new branch] gh/mikaylagawarecki/237/head -> origin/gh/mikaylagawarecki/237/head 2025-09-07T07:46:33.8012704Z * [new branch] gh/mikaylagawarecki/238/base -> origin/gh/mikaylagawarecki/238/base 2025-09-07T07:46:33.8013391Z * [new branch] gh/mikaylagawarecki/238/head -> origin/gh/mikaylagawarecki/238/head 2025-09-07T07:46:33.8014080Z * [new branch] gh/mikaylagawarecki/317/base -> origin/gh/mikaylagawarecki/317/base 2025-09-07T07:46:33.8014768Z * [new branch] gh/mikaylagawarecki/317/head -> origin/gh/mikaylagawarecki/317/head 2025-09-07T07:46:33.8015436Z * [new branch] gh/mikaylagawarecki/317/orig -> origin/gh/mikaylagawarecki/317/orig 2025-09-07T07:46:33.8016117Z * [new branch] gh/mikaylagawarecki/320/base -> origin/gh/mikaylagawarecki/320/base 2025-09-07T07:46:33.8016804Z * [new branch] gh/mikaylagawarecki/320/head -> origin/gh/mikaylagawarecki/320/head 2025-09-07T07:46:33.8017567Z * [new branch] gh/mikaylagawarecki/320/orig -> origin/gh/mikaylagawarecki/320/orig 2025-09-07T07:46:33.8018258Z * [new branch] gh/mikaylagawarecki/329/base -> origin/gh/mikaylagawarecki/329/base 2025-09-07T07:46:33.8018929Z * [new branch] gh/mikaylagawarecki/329/head -> origin/gh/mikaylagawarecki/329/head 2025-09-07T07:46:33.8019610Z * [new branch] gh/mikaylagawarecki/329/orig -> origin/gh/mikaylagawarecki/329/orig 2025-09-07T07:46:33.8020295Z * [new branch] gh/mikaylagawarecki/330/base -> origin/gh/mikaylagawarecki/330/base 2025-09-07T07:46:33.8020977Z * [new branch] gh/mikaylagawarecki/330/head -> 
origin/gh/mikaylagawarecki/330/head 2025-09-07T07:46:33.8021659Z * [new branch] gh/mikaylagawarecki/330/orig -> origin/gh/mikaylagawarecki/330/orig 2025-09-07T07:46:33.8022334Z * [new branch] gh/mikaylagawarecki/331/base -> origin/gh/mikaylagawarecki/331/base 2025-09-07T07:46:33.8023139Z * [new branch] gh/mikaylagawarecki/331/head -> origin/gh/mikaylagawarecki/331/head 2025-09-07T07:46:33.8023824Z * [new branch] gh/mikaylagawarecki/331/orig -> origin/gh/mikaylagawarecki/331/orig 2025-09-07T07:46:33.8024506Z * [new branch] gh/mikaylagawarecki/332/base -> origin/gh/mikaylagawarecki/332/base 2025-09-07T07:46:33.8025189Z * [new branch] gh/mikaylagawarecki/332/head -> origin/gh/mikaylagawarecki/332/head 2025-09-07T07:46:33.8025856Z * [new branch] gh/mikaylagawarecki/332/orig -> origin/gh/mikaylagawarecki/332/orig 2025-09-07T07:46:33.8026535Z * [new branch] gh/mikaylagawarecki/334/base -> origin/gh/mikaylagawarecki/334/base 2025-09-07T07:46:33.8027215Z * [new branch] gh/mikaylagawarecki/334/head -> origin/gh/mikaylagawarecki/334/head 2025-09-07T07:46:33.8027901Z * [new branch] gh/mikaylagawarecki/334/orig -> origin/gh/mikaylagawarecki/334/orig 2025-09-07T07:46:33.8028587Z * [new branch] gh/mikaylagawarecki/335/base -> origin/gh/mikaylagawarecki/335/base 2025-09-07T07:46:33.8029256Z * [new branch] gh/mikaylagawarecki/335/head -> origin/gh/mikaylagawarecki/335/head 2025-09-07T07:46:33.8029940Z * [new branch] gh/mikaylagawarecki/335/orig -> origin/gh/mikaylagawarecki/335/orig 2025-09-07T07:46:33.8030620Z * [new branch] gh/mikaylagawarecki/336/base -> origin/gh/mikaylagawarecki/336/base 2025-09-07T07:46:33.8031300Z * [new branch] gh/mikaylagawarecki/336/head -> origin/gh/mikaylagawarecki/336/head 2025-09-07T07:46:33.8031980Z * [new branch] gh/mikaylagawarecki/336/orig -> origin/gh/mikaylagawarecki/336/orig 2025-09-07T07:46:33.8032644Z * [new branch] gh/mikaylagawarecki/337/base -> origin/gh/mikaylagawarecki/337/base 2025-09-07T07:46:33.8033460Z * [new branch] 
gh/mikaylagawarecki/337/head -> origin/gh/mikaylagawarecki/337/head 2025-09-07T07:46:33.8034148Z * [new branch] gh/mikaylagawarecki/337/orig -> origin/gh/mikaylagawarecki/337/orig 2025-09-07T07:46:33.8034830Z * [new branch] gh/mikaylagawarecki/338/base -> origin/gh/mikaylagawarecki/338/base 2025-09-07T07:46:33.8035512Z * [new branch] gh/mikaylagawarecki/338/head -> origin/gh/mikaylagawarecki/338/head 2025-09-07T07:46:33.8036195Z * [new branch] gh/mikaylagawarecki/338/orig -> origin/gh/mikaylagawarecki/338/orig 2025-09-07T07:46:33.8036875Z * [new branch] gh/mikaylagawarecki/339/base -> origin/gh/mikaylagawarecki/339/base 2025-09-07T07:46:33.8037556Z * [new branch] gh/mikaylagawarecki/339/head -> origin/gh/mikaylagawarecki/339/head 2025-09-07T07:46:33.8038231Z * [new branch] gh/mikaylagawarecki/339/orig -> origin/gh/mikaylagawarecki/339/orig 2025-09-07T07:46:33.8038834Z * [new branch] gh/mlazos/1/base -> origin/gh/mlazos/1/base 2025-09-07T07:46:33.8039374Z * [new branch] gh/mlazos/1/head -> origin/gh/mlazos/1/head 2025-09-07T07:46:33.8039914Z * [new branch] gh/mlazos/1/orig -> origin/gh/mlazos/1/orig 2025-09-07T07:46:33.8040462Z * [new branch] gh/mlazos/12/base -> origin/gh/mlazos/12/base 2025-09-07T07:46:33.8041004Z * [new branch] gh/mlazos/12/head -> origin/gh/mlazos/12/head 2025-09-07T07:46:33.8041533Z * [new branch] gh/mlazos/12/orig -> origin/gh/mlazos/12/orig 2025-09-07T07:46:33.8042075Z * [new branch] gh/mlazos/13/base -> origin/gh/mlazos/13/base 2025-09-07T07:46:33.8042616Z * [new branch] gh/mlazos/13/head -> origin/gh/mlazos/13/head 2025-09-07T07:46:33.8043281Z * [new branch] gh/mlazos/13/orig -> origin/gh/mlazos/13/orig 2025-09-07T07:46:33.8043822Z * [new branch] gh/mlazos/14/base -> origin/gh/mlazos/14/base 2025-09-07T07:46:33.8044353Z * [new branch] gh/mlazos/14/head -> origin/gh/mlazos/14/head 2025-09-07T07:46:33.8045012Z * [new branch] gh/mlazos/14/orig -> origin/gh/mlazos/14/orig 2025-09-07T07:46:33.8045559Z * [new branch] gh/mlazos/15/base -> 
origin/gh/mlazos/15/base 2025-09-07T07:46:33.8046104Z * [new branch] gh/mlazos/15/head -> origin/gh/mlazos/15/head 2025-09-07T07:46:33.8046630Z * [new branch] gh/mlazos/15/orig -> origin/gh/mlazos/15/orig 2025-09-07T07:46:33.8047177Z * [new branch] gh/mlazos/16/base -> origin/gh/mlazos/16/base 2025-09-07T07:46:33.8047723Z * [new branch] gh/mlazos/16/head -> origin/gh/mlazos/16/head 2025-09-07T07:46:33.8048266Z * [new branch] gh/mlazos/16/orig -> origin/gh/mlazos/16/orig 2025-09-07T07:46:33.8048817Z * [new branch] gh/mlazos/17/base -> origin/gh/mlazos/17/base 2025-09-07T07:46:33.8049347Z * [new branch] gh/mlazos/17/head -> origin/gh/mlazos/17/head 2025-09-07T07:46:33.8049893Z * [new branch] gh/mlazos/17/orig -> origin/gh/mlazos/17/orig 2025-09-07T07:46:33.8050430Z * [new branch] gh/mlazos/2/base -> origin/gh/mlazos/2/base 2025-09-07T07:46:33.8050959Z * [new branch] gh/mlazos/2/head -> origin/gh/mlazos/2/head 2025-09-07T07:46:33.8051492Z * [new branch] gh/mlazos/2/orig -> origin/gh/mlazos/2/orig 2025-09-07T07:46:33.8052008Z * [new branch] gh/mlazos/3/base -> origin/gh/mlazos/3/base 2025-09-07T07:46:33.8052537Z * [new branch] gh/mlazos/3/head -> origin/gh/mlazos/3/head 2025-09-07T07:46:33.8053069Z * [new branch] gh/mlazos/3/orig -> origin/gh/mlazos/3/orig 2025-09-07T07:46:33.8053738Z * [new branch] gh/mrmiywj/1/base -> origin/gh/mrmiywj/1/base 2025-09-07T07:46:33.8054283Z * [new branch] gh/mrmiywj/1/head -> origin/gh/mrmiywj/1/head 2025-09-07T07:46:33.8054840Z * [new branch] gh/muchulee8/62/base -> origin/gh/muchulee8/62/base 2025-09-07T07:46:33.8055410Z * [new branch] gh/muchulee8/62/head -> origin/gh/muchulee8/62/head 2025-09-07T07:46:33.8055980Z * [new branch] gh/muchulee8/62/orig -> origin/gh/muchulee8/62/orig 2025-09-07T07:46:33.8056549Z * [new branch] gh/muchulee8/63/base -> origin/gh/muchulee8/63/base 2025-09-07T07:46:33.8057116Z * [new branch] gh/muchulee8/63/head -> origin/gh/muchulee8/63/head 2025-09-07T07:46:33.8057758Z * [new branch] gh/muchulee8/63/orig 
-> origin/gh/muchulee8/63/orig 2025-09-07T07:46:33.8058325Z * [new branch] gh/muchulee8/64/base -> origin/gh/muchulee8/64/base 2025-09-07T07:46:33.8058899Z * [new branch] gh/muchulee8/64/head -> origin/gh/muchulee8/64/head 2025-09-07T07:46:33.8059471Z * [new branch] gh/muchulee8/64/orig -> origin/gh/muchulee8/64/orig 2025-09-07T07:46:33.8060028Z * [new branch] gh/muchulee8/65/base -> origin/gh/muchulee8/65/base 2025-09-07T07:46:33.8060594Z * [new branch] gh/muchulee8/65/head -> origin/gh/muchulee8/65/head 2025-09-07T07:46:33.8061160Z * [new branch] gh/muchulee8/65/orig -> origin/gh/muchulee8/65/orig 2025-09-07T07:46:33.8061763Z * [new branch] gh/naveenthangudu/1/base -> origin/gh/naveenthangudu/1/base 2025-09-07T07:46:33.8062394Z * [new branch] gh/naveenthangudu/1/head -> origin/gh/naveenthangudu/1/head 2025-09-07T07:46:33.8063011Z * [new branch] gh/naveenthangudu/1/orig -> origin/gh/naveenthangudu/1/orig 2025-09-07T07:46:33.8063635Z * [new branch] gh/naveenthangudu/2/base -> origin/gh/naveenthangudu/2/base 2025-09-07T07:46:33.8064264Z * [new branch] gh/naveenthangudu/2/head -> origin/gh/naveenthangudu/2/head 2025-09-07T07:46:33.8064981Z * [new branch] gh/naveenthangudu/2/orig -> origin/gh/naveenthangudu/2/orig 2025-09-07T07:46:33.8065616Z * [new branch] gh/naveenthangudu/3/base -> origin/gh/naveenthangudu/3/base 2025-09-07T07:46:33.8066232Z * [new branch] gh/naveenthangudu/3/head -> origin/gh/naveenthangudu/3/head 2025-09-07T07:46:33.8066860Z * [new branch] gh/naveenthangudu/3/orig -> origin/gh/naveenthangudu/3/orig 2025-09-07T07:46:33.8067488Z * [new branch] gh/naveenthangudu/4/base -> origin/gh/naveenthangudu/4/base 2025-09-07T07:46:33.8068118Z * [new branch] gh/naveenthangudu/4/head -> origin/gh/naveenthangudu/4/head 2025-09-07T07:46:33.8068746Z * [new branch] gh/naveenthangudu/4/orig -> origin/gh/naveenthangudu/4/orig 2025-09-07T07:46:33.8069363Z * [new branch] gh/naveenthangudu/5/base -> origin/gh/naveenthangudu/5/base 2025-09-07T07:46:33.8069993Z * [new 
branch] gh/naveenthangudu/5/head -> origin/gh/naveenthangudu/5/head 2025-09-07T07:46:33.8070618Z * [new branch] gh/naveenthangudu/5/orig -> origin/gh/naveenthangudu/5/orig 2025-09-07T07:46:33.8071242Z * [new branch] gh/naveenthangudu/6/base -> origin/gh/naveenthangudu/6/base 2025-09-07T07:46:33.8071865Z * [new branch] gh/naveenthangudu/6/head -> origin/gh/naveenthangudu/6/head 2025-09-07T07:46:33.8072478Z * [new branch] gh/naveenthangudu/6/orig -> origin/gh/naveenthangudu/6/orig 2025-09-07T07:46:33.8073064Z * [new branch] gh/oulgen/35/base -> origin/gh/oulgen/35/base 2025-09-07T07:46:33.8073607Z * [new branch] gh/oulgen/35/head -> origin/gh/oulgen/35/head 2025-09-07T07:46:33.8074151Z * [new branch] gh/oulgen/35/orig -> origin/gh/oulgen/35/orig 2025-09-07T07:46:33.8074788Z * [new branch] gh/oulgen/48/base -> origin/gh/oulgen/48/base 2025-09-07T07:46:33.8075319Z * [new branch] gh/oulgen/48/head -> origin/gh/oulgen/48/head 2025-09-07T07:46:33.8075858Z * [new branch] gh/oulgen/48/orig -> origin/gh/oulgen/48/orig 2025-09-07T07:46:33.8076399Z * [new branch] gh/oulgen/49/base -> origin/gh/oulgen/49/base 2025-09-07T07:46:33.8076942Z * [new branch] gh/oulgen/49/head -> origin/gh/oulgen/49/head 2025-09-07T07:46:33.8077483Z * [new branch] gh/oulgen/49/orig -> origin/gh/oulgen/49/orig 2025-09-07T07:46:33.8078016Z * [new branch] gh/pearu/108/base -> origin/gh/pearu/108/base 2025-09-07T07:46:33.8078561Z * [new branch] gh/pearu/108/head -> origin/gh/pearu/108/head 2025-09-07T07:46:33.8079100Z * [new branch] gh/pearu/108/orig -> origin/gh/pearu/108/orig 2025-09-07T07:46:33.8079651Z * [new branch] gh/pearu/109/base -> origin/gh/pearu/109/base 2025-09-07T07:46:33.8080181Z * [new branch] gh/pearu/109/head -> origin/gh/pearu/109/head 2025-09-07T07:46:33.8080725Z * [new branch] gh/pearu/109/orig -> origin/gh/pearu/109/orig 2025-09-07T07:46:33.8081269Z * [new branch] gh/pearu/110/base -> origin/gh/pearu/110/base 2025-09-07T07:46:33.8081812Z * [new branch] gh/pearu/110/head -> 
origin/gh/pearu/110/head 2025-09-07T07:46:33.8082354Z * [new branch] gh/pearu/110/orig -> origin/gh/pearu/110/orig 2025-09-07T07:46:33.8082998Z * [new branch] gh/pearu/111/base -> origin/gh/pearu/111/base 2025-09-07T07:46:33.8083547Z * [new branch] gh/pearu/111/head -> origin/gh/pearu/111/head 2025-09-07T07:46:33.8084090Z * [new branch] gh/pearu/111/orig -> origin/gh/pearu/111/orig 2025-09-07T07:46:33.8084638Z * [new branch] gh/pearu/112/base -> origin/gh/pearu/112/base 2025-09-07T07:46:33.8085325Z * [new branch] gh/pearu/112/head -> origin/gh/pearu/112/head 2025-09-07T07:46:33.8085863Z * [new branch] gh/pearu/112/orig -> origin/gh/pearu/112/orig 2025-09-07T07:46:33.8086407Z * [new branch] gh/pearu/113/base -> origin/gh/pearu/113/base 2025-09-07T07:46:33.8086949Z * [new branch] gh/pearu/113/head -> origin/gh/pearu/113/head 2025-09-07T07:46:33.8087491Z * [new branch] gh/pearu/113/orig -> origin/gh/pearu/113/orig 2025-09-07T07:46:33.8088037Z * [new branch] gh/pearu/114/base -> origin/gh/pearu/114/base 2025-09-07T07:46:33.8088568Z * [new branch] gh/pearu/114/head -> origin/gh/pearu/114/head 2025-09-07T07:46:33.8089113Z * [new branch] gh/pearu/114/orig -> origin/gh/pearu/114/orig 2025-09-07T07:46:33.8089661Z * [new branch] gh/pearu/115/base -> origin/gh/pearu/115/base 2025-09-07T07:46:33.8090207Z * [new branch] gh/pearu/115/head -> origin/gh/pearu/115/head 2025-09-07T07:46:33.8090741Z * [new branch] gh/pearu/115/orig -> origin/gh/pearu/115/orig 2025-09-07T07:46:33.8091283Z * [new branch] gh/pearu/116/base -> origin/gh/pearu/116/base 2025-09-07T07:46:33.8091828Z * [new branch] gh/pearu/116/head -> origin/gh/pearu/116/head 2025-09-07T07:46:33.8092371Z * [new branch] gh/pearu/116/orig -> origin/gh/pearu/116/orig 2025-09-07T07:46:33.8092918Z * [new branch] gh/pearu/117/base -> origin/gh/pearu/117/base 2025-09-07T07:46:33.8093444Z * [new branch] gh/pearu/117/head -> origin/gh/pearu/117/head 2025-09-07T07:46:33.8093989Z * [new branch] gh/pearu/117/orig -> 
origin/gh/pearu/117/orig 2025-09-07T07:46:33.8094648Z * [new branch] gh/pearu/56/base -> origin/gh/pearu/56/base 2025-09-07T07:46:33.8095194Z * [new branch] gh/pearu/56/head -> origin/gh/pearu/56/head 2025-09-07T07:46:33.8095736Z * [new branch] gh/pearu/56/orig -> origin/gh/pearu/56/orig 2025-09-07T07:46:33.8096265Z * [new branch] gh/pearu/97/base -> origin/gh/pearu/97/base 2025-09-07T07:46:33.8096803Z * [new branch] gh/pearu/97/head -> origin/gh/pearu/97/head 2025-09-07T07:46:33.8097436Z * [new branch] gh/pearu/97/orig -> origin/gh/pearu/97/orig 2025-09-07T07:46:33.8097984Z * [new branch] gh/qqaatw/29/base -> origin/gh/qqaatw/29/base 2025-09-07T07:46:33.8098534Z * [new branch] gh/qqaatw/29/head -> origin/gh/qqaatw/29/head 2025-09-07T07:46:33.8099066Z * [new branch] gh/qqaatw/29/orig -> origin/gh/qqaatw/29/orig 2025-09-07T07:46:33.8099660Z * [new branch] gh/raymo/refresh-script -> origin/gh/raymo/refresh-script 2025-09-07T07:46:33.8100240Z * [new branch] gh/rec/141/base -> origin/gh/rec/141/base 2025-09-07T07:46:33.8100763Z * [new branch] gh/rec/141/head -> origin/gh/rec/141/head 2025-09-07T07:46:33.8101270Z * [new branch] gh/rec/153/base -> origin/gh/rec/153/base 2025-09-07T07:46:33.8101792Z * [new branch] gh/rec/153/head -> origin/gh/rec/153/head 2025-09-07T07:46:33.8102307Z * [new branch] gh/rec/153/orig -> origin/gh/rec/153/orig 2025-09-07T07:46:33.8102823Z * [new branch] gh/rec/154/base -> origin/gh/rec/154/base 2025-09-07T07:46:33.8103337Z * [new branch] gh/rec/154/head -> origin/gh/rec/154/head 2025-09-07T07:46:33.8103838Z * [new branch] gh/rec/154/orig -> origin/gh/rec/154/orig 2025-09-07T07:46:33.8104362Z * [new branch] gh/rec/156/base -> origin/gh/rec/156/base 2025-09-07T07:46:33.8105023Z * [new branch] gh/rec/156/head -> origin/gh/rec/156/head 2025-09-07T07:46:33.8105546Z * [new branch] gh/rec/156/orig -> origin/gh/rec/156/orig 2025-09-07T07:46:33.8106067Z * [new branch] gh/rec/160/base -> origin/gh/rec/160/base 2025-09-07T07:46:33.8106569Z * [new 
branch] gh/rec/160/head -> origin/gh/rec/160/head 2025-09-07T07:46:33.8107088Z * [new branch] gh/rec/160/orig -> origin/gh/rec/160/orig 2025-09-07T07:46:33.8107603Z * [new branch] gh/rec/162/base -> origin/gh/rec/162/base 2025-09-07T07:46:33.8108116Z * [new branch] gh/rec/162/head -> origin/gh/rec/162/head 2025-09-07T07:46:33.8108631Z * [new branch] gh/rec/162/orig -> origin/gh/rec/162/orig 2025-09-07T07:46:33.8109139Z * [new branch] gh/rec/163/base -> origin/gh/rec/163/base 2025-09-07T07:46:33.8109658Z * [new branch] gh/rec/163/head -> origin/gh/rec/163/head 2025-09-07T07:46:33.8110177Z * [new branch] gh/rec/163/orig -> origin/gh/rec/163/orig 2025-09-07T07:46:33.8110698Z * [new branch] gh/rec/164/base -> origin/gh/rec/164/base 2025-09-07T07:46:33.8111209Z * [new branch] gh/rec/164/head -> origin/gh/rec/164/head 2025-09-07T07:46:33.8111712Z * [new branch] gh/rec/164/orig -> origin/gh/rec/164/orig 2025-09-07T07:46:33.8112234Z * [new branch] gh/rec/165/base -> origin/gh/rec/165/base 2025-09-07T07:46:33.8112747Z * [new branch] gh/rec/165/head -> origin/gh/rec/165/head 2025-09-07T07:46:33.8113266Z * [new branch] gh/rec/165/orig -> origin/gh/rec/165/orig 2025-09-07T07:46:33.8113876Z * [new branch] gh/rec/166/base -> origin/gh/rec/166/base 2025-09-07T07:46:33.8114399Z * [new branch] gh/rec/166/head -> origin/gh/rec/166/head 2025-09-07T07:46:33.8114926Z * [new branch] gh/rec/166/orig -> origin/gh/rec/166/orig 2025-09-07T07:46:33.8115508Z * [new branch] gh/robert-hardwick/1/base -> origin/gh/robert-hardwick/1/base 2025-09-07T07:46:33.8116151Z * [new branch] gh/robert-hardwick/1/head -> origin/gh/robert-hardwick/1/head 2025-09-07T07:46:33.8116770Z * [new branch] gh/robert-hardwick/1/orig -> origin/gh/robert-hardwick/1/orig 2025-09-07T07:46:33.8117391Z * [new branch] gh/robert-hardwick/2/base -> origin/gh/robert-hardwick/2/base 2025-09-07T07:46:33.8118014Z * [new branch] gh/robert-hardwick/2/head -> origin/gh/robert-hardwick/2/head 2025-09-07T07:46:33.8118640Z * [new branch] 
gh/robert-hardwick/2/orig -> origin/gh/robert-hardwick/2/orig 2025-09-07T07:46:33.8119270Z * [new branch] gh/robert-hardwick/3/base -> origin/gh/robert-hardwick/3/base 2025-09-07T07:46:33.8119889Z * [new branch] gh/robert-hardwick/3/head -> origin/gh/robert-hardwick/3/head 2025-09-07T07:46:33.8120511Z * [new branch] gh/robert-hardwick/3/orig -> origin/gh/robert-hardwick/3/orig 2025-09-07T07:46:33.8121135Z * [new branch] gh/robert-hardwick/4/base -> origin/gh/robert-hardwick/4/base 2025-09-07T07:46:33.8121757Z * [new branch] gh/robert-hardwick/4/head -> origin/gh/robert-hardwick/4/head 2025-09-07T07:46:33.8122383Z * [new branch] gh/robert-hardwick/4/orig -> origin/gh/robert-hardwick/4/orig 2025-09-07T07:46:33.8123073Z * [new branch] gh/rtimpe/1/base -> origin/gh/rtimpe/1/base 2025-09-07T07:46:33.8123618Z * [new branch] gh/rtimpe/1/head -> origin/gh/rtimpe/1/head 2025-09-07T07:46:33.8124168Z * [new branch] gh/rtimpe/10/base -> origin/gh/rtimpe/10/base 2025-09-07T07:46:33.8124718Z * [new branch] gh/rtimpe/10/head -> origin/gh/rtimpe/10/head 2025-09-07T07:46:33.8125382Z * [new branch] gh/rtimpe/10/orig -> origin/gh/rtimpe/10/orig 2025-09-07T07:46:33.8125920Z * [new branch] gh/rtimpe/11/base -> origin/gh/rtimpe/11/base 2025-09-07T07:46:33.8126466Z * [new branch] gh/rtimpe/11/head -> origin/gh/rtimpe/11/head 2025-09-07T07:46:33.8127009Z * [new branch] gh/rtimpe/11/orig -> origin/gh/rtimpe/11/orig 2025-09-07T07:46:33.8127554Z * [new branch] gh/rtimpe/12/base -> origin/gh/rtimpe/12/base 2025-09-07T07:46:33.8128087Z * [new branch] gh/rtimpe/12/head -> origin/gh/rtimpe/12/head 2025-09-07T07:46:33.8128630Z * [new branch] gh/rtimpe/12/orig -> origin/gh/rtimpe/12/orig 2025-09-07T07:46:33.8129178Z * [new branch] gh/rtimpe/13/base -> origin/gh/rtimpe/13/base 2025-09-07T07:46:33.8129716Z * [new branch] gh/rtimpe/13/head -> origin/gh/rtimpe/13/head 2025-09-07T07:46:33.8130261Z * [new branch] gh/rtimpe/13/orig -> origin/gh/rtimpe/13/orig 2025-09-07T07:46:33.8130793Z * [new branch] 
gh/rtimpe/14/base -> origin/gh/rtimpe/14/base 2025-09-07T07:46:33.8131338Z * [new branch] gh/rtimpe/14/head -> origin/gh/rtimpe/14/head 2025-09-07T07:46:33.8131883Z * [new branch] gh/rtimpe/14/orig -> origin/gh/rtimpe/14/orig 2025-09-07T07:46:33.8132425Z * [new branch] gh/rtimpe/15/base -> origin/gh/rtimpe/15/base 2025-09-07T07:46:33.8132974Z * [new branch] gh/rtimpe/15/head -> origin/gh/rtimpe/15/head 2025-09-07T07:46:33.8133507Z * [new branch] gh/rtimpe/15/orig -> origin/gh/rtimpe/15/orig 2025-09-07T07:46:33.8134171Z * [new branch] gh/rtimpe/2/base -> origin/gh/rtimpe/2/base 2025-09-07T07:46:33.8134707Z * [new branch] gh/rtimpe/2/head -> origin/gh/rtimpe/2/head 2025-09-07T07:46:33.8135240Z * [new branch] gh/rtimpe/3/base -> origin/gh/rtimpe/3/base 2025-09-07T07:46:33.8135775Z * [new branch] gh/rtimpe/3/head -> origin/gh/rtimpe/3/head 2025-09-07T07:46:33.8136298Z * [new branch] gh/rtimpe/4/base -> origin/gh/rtimpe/4/base 2025-09-07T07:46:33.8136836Z * [new branch] gh/rtimpe/4/head -> origin/gh/rtimpe/4/head 2025-09-07T07:46:33.8137459Z * [new branch] gh/rtimpe/9/base -> origin/gh/rtimpe/9/base 2025-09-07T07:46:33.8137999Z * [new branch] gh/rtimpe/9/head -> origin/gh/rtimpe/9/head 2025-09-07T07:46:33.8138537Z * [new branch] gh/rtimpe/9/orig -> origin/gh/rtimpe/9/orig 2025-09-07T07:46:33.8139110Z * [new branch] gh/ruisizhang123/1/base -> origin/gh/ruisizhang123/1/base 2025-09-07T07:46:33.8139725Z * [new branch] gh/ruisizhang123/1/head -> origin/gh/ruisizhang123/1/head 2025-09-07T07:46:33.8140337Z * [new branch] gh/ruisizhang123/1/orig -> origin/gh/ruisizhang123/1/orig 2025-09-07T07:46:33.8140948Z * [new branch] gh/ruisizhang123/4/base -> origin/gh/ruisizhang123/4/base 2025-09-07T07:46:33.8141543Z * [new branch] gh/ruisizhang123/4/head -> origin/gh/ruisizhang123/4/head 2025-09-07T07:46:33.8142150Z * [new branch] gh/ruisizhang123/4/orig -> origin/gh/ruisizhang123/4/orig 2025-09-07T07:46:33.8142755Z * [new branch] gh/ruisizhang123/5/base -> 
origin/gh/ruisizhang123/5/base 2025-09-07T07:46:33.8143361Z * [new branch] gh/ruisizhang123/5/head -> origin/gh/ruisizhang123/5/head 2025-09-07T07:46:33.8143970Z * [new branch] gh/ruisizhang123/5/orig -> origin/gh/ruisizhang123/5/orig 2025-09-07T07:46:33.8144562Z * [new branch] gh/ruisizhang123/6/base -> origin/gh/ruisizhang123/6/base 2025-09-07T07:46:33.8145267Z * [new branch] gh/ruisizhang123/6/head -> origin/gh/ruisizhang123/6/head 2025-09-07T07:46:33.8145882Z * [new branch] gh/ruisizhang123/6/orig -> origin/gh/ruisizhang123/6/orig 2025-09-07T07:46:33.8146492Z * [new branch] gh/ruisizhang123/7/base -> origin/gh/ruisizhang123/7/base 2025-09-07T07:46:33.8147100Z * [new branch] gh/ruisizhang123/7/head -> origin/gh/ruisizhang123/7/head 2025-09-07T07:46:33.8147698Z * [new branch] gh/ruisizhang123/7/orig -> origin/gh/ruisizhang123/7/orig 2025-09-07T07:46:33.8148305Z * [new branch] gh/ruisizhang123/8/base -> origin/gh/ruisizhang123/8/base 2025-09-07T07:46:33.8148915Z * [new branch] gh/ruisizhang123/8/head -> origin/gh/ruisizhang123/8/head 2025-09-07T07:46:33.8149527Z * [new branch] gh/ruisizhang123/8/orig -> origin/gh/ruisizhang123/8/orig 2025-09-07T07:46:33.8150136Z * [new branch] gh/ruisizhang123/9/base -> origin/gh/ruisizhang123/9/base 2025-09-07T07:46:33.8150734Z * [new branch] gh/ruisizhang123/9/head -> origin/gh/ruisizhang123/9/head 2025-09-07T07:46:33.8151338Z * [new branch] gh/ruisizhang123/9/orig -> origin/gh/ruisizhang123/9/orig 2025-09-07T07:46:33.8151912Z * [new branch] gh/sarckk/2/base -> origin/gh/sarckk/2/base 2025-09-07T07:46:33.8152451Z * [new branch] gh/sarckk/2/head -> origin/gh/sarckk/2/head 2025-09-07T07:46:33.8152985Z * [new branch] gh/sarckk/2/orig -> origin/gh/sarckk/2/orig 2025-09-07T07:46:33.8153539Z * [new branch] gh/seemethere/35/base -> origin/gh/seemethere/35/base 2025-09-07T07:46:33.8154126Z * [new branch] gh/seemethere/35/head -> origin/gh/seemethere/35/head 2025-09-07T07:46:33.8154815Z * [new branch] gh/seemethere/35/orig -> 
origin/gh/seemethere/35/orig 2025-09-07T07:46:33.8155401Z * [new branch] gh/seemethere/37/base -> origin/gh/seemethere/37/base 2025-09-07T07:46:33.8155992Z * [new branch] gh/seemethere/37/head -> origin/gh/seemethere/37/head 2025-09-07T07:46:33.8156562Z * [new branch] gh/seemethere/37/orig -> origin/gh/seemethere/37/orig 2025-09-07T07:46:33.8157147Z * [new branch] gh/seemethere/43/base -> origin/gh/seemethere/43/base 2025-09-07T07:46:33.8157727Z * [new branch] gh/seemethere/43/head -> origin/gh/seemethere/43/head 2025-09-07T07:46:33.8158318Z * [new branch] gh/seemethere/43/orig -> origin/gh/seemethere/43/orig 2025-09-07T07:46:33.8158902Z * [new branch] gh/seemethere/44/base -> origin/gh/seemethere/44/base 2025-09-07T07:46:33.8159537Z * [new branch] gh/seemethere/44/head -> origin/gh/seemethere/44/head 2025-09-07T07:46:33.8160122Z * [new branch] gh/seemethere/44/orig -> origin/gh/seemethere/44/orig 2025-09-07T07:46:33.8160714Z * [new branch] gh/seemethere/48/base -> origin/gh/seemethere/48/base 2025-09-07T07:46:33.8161303Z * [new branch] gh/seemethere/48/head -> origin/gh/seemethere/48/head 2025-09-07T07:46:33.8161873Z * [new branch] gh/seemethere/48/orig -> origin/gh/seemethere/48/orig 2025-09-07T07:46:33.8162458Z * [new branch] gh/seemethere/49/base -> origin/gh/seemethere/49/base 2025-09-07T07:46:33.8163163Z * [new branch] gh/seemethere/49/head -> origin/gh/seemethere/49/head 2025-09-07T07:46:33.8163749Z * [new branch] gh/seemethere/49/orig -> origin/gh/seemethere/49/orig 2025-09-07T07:46:33.8164334Z * [new branch] gh/seemethere/52/base -> origin/gh/seemethere/52/base 2025-09-07T07:46:33.8164905Z * [new branch] gh/seemethere/52/head -> origin/gh/seemethere/52/head 2025-09-07T07:46:33.8165500Z * [new branch] gh/seemethere/52/orig -> origin/gh/seemethere/52/orig 2025-09-07T07:46:33.8166282Z * [new branch] gh/seemethere/53/base -> origin/gh/seemethere/53/base 2025-09-07T07:46:33.8166877Z * [new branch] gh/seemethere/53/head -> origin/gh/seemethere/53/head 
2025-09-07T07:46:33.8167473Z * [new branch] gh/seemethere/53/orig -> origin/gh/seemethere/53/orig 2025-09-07T07:46:33.8168052Z * [new branch] gh/seemethere/54/base -> origin/gh/seemethere/54/base 2025-09-07T07:46:33.8168646Z * [new branch] gh/seemethere/54/head -> origin/gh/seemethere/54/head 2025-09-07T07:46:33.8169236Z * [new branch] gh/seemethere/54/orig -> origin/gh/seemethere/54/orig 2025-09-07T07:46:33.8169822Z * [new branch] gh/seemethere/55/base -> origin/gh/seemethere/55/base 2025-09-07T07:46:33.8170415Z * [new branch] gh/seemethere/55/head -> origin/gh/seemethere/55/head 2025-09-07T07:46:33.8170986Z * [new branch] gh/seemethere/55/orig -> origin/gh/seemethere/55/orig 2025-09-07T07:46:33.8171573Z * [new branch] gh/seemethere/56/base -> origin/gh/seemethere/56/base 2025-09-07T07:46:33.8172163Z * [new branch] gh/seemethere/56/head -> origin/gh/seemethere/56/head 2025-09-07T07:46:33.8172749Z * [new branch] gh/seemethere/56/orig -> origin/gh/seemethere/56/orig 2025-09-07T07:46:33.8173334Z * [new branch] gh/seemethere/57/base -> origin/gh/seemethere/57/base 2025-09-07T07:46:33.8173906Z * [new branch] gh/seemethere/57/head -> origin/gh/seemethere/57/head 2025-09-07T07:46:33.8174492Z * [new branch] gh/seemethere/57/orig -> origin/gh/seemethere/57/orig 2025-09-07T07:46:33.8175076Z * [new branch] gh/seemethere/58/base -> origin/gh/seemethere/58/base 2025-09-07T07:46:33.8175762Z * [new branch] gh/seemethere/58/head -> origin/gh/seemethere/58/head 2025-09-07T07:46:33.8176353Z * [new branch] gh/seemethere/58/orig -> origin/gh/seemethere/58/orig 2025-09-07T07:46:33.8176929Z * [new branch] gh/seemethere/59/base -> origin/gh/seemethere/59/base 2025-09-07T07:46:33.8177601Z * [new branch] gh/seemethere/59/head -> origin/gh/seemethere/59/head 2025-09-07T07:46:33.8178190Z * [new branch] gh/seemethere/59/orig -> origin/gh/seemethere/59/orig 2025-09-07T07:46:33.8178777Z * [new branch] gh/seemethere/60/base -> origin/gh/seemethere/60/base 2025-09-07T07:46:33.8179348Z * [new 
branch] gh/seemethere/60/head -> origin/gh/seemethere/60/head 2025-09-07T07:46:33.8179933Z * [new branch] gh/seemethere/60/orig -> origin/gh/seemethere/60/orig 2025-09-07T07:46:33.8180520Z * [new branch] gh/seemethere/61/base -> origin/gh/seemethere/61/base 2025-09-07T07:46:33.8181108Z * [new branch] gh/seemethere/61/head -> origin/gh/seemethere/61/head 2025-09-07T07:46:33.8181695Z * [new branch] gh/seemethere/61/orig -> origin/gh/seemethere/61/orig 2025-09-07T07:46:33.8182266Z * [new branch] gh/seemethere/62/base -> origin/gh/seemethere/62/base 2025-09-07T07:46:33.8182850Z * [new branch] gh/seemethere/62/head -> origin/gh/seemethere/62/head 2025-09-07T07:46:33.8183438Z * [new branch] gh/seemethere/62/orig -> origin/gh/seemethere/62/orig 2025-09-07T07:46:33.8184016Z * [new branch] gh/seemethere/63/base -> origin/gh/seemethere/63/base 2025-09-07T07:46:33.8184603Z * [new branch] gh/seemethere/63/head -> origin/gh/seemethere/63/head 2025-09-07T07:46:33.8185173Z * [new branch] gh/seemethere/63/orig -> origin/gh/seemethere/63/orig 2025-09-07T07:46:33.8185773Z * [new branch] gh/shunting314/145/base -> origin/gh/shunting314/145/base 2025-09-07T07:46:33.8186379Z * [new branch] gh/shunting314/145/head -> origin/gh/shunting314/145/head 2025-09-07T07:46:33.8187122Z * [new branch] gh/shunting314/145/orig -> origin/gh/shunting314/145/orig 2025-09-07T07:46:33.8187726Z * [new branch] gh/shunting314/176/base -> origin/gh/shunting314/176/base 2025-09-07T07:46:33.8188318Z * [new branch] gh/shunting314/176/head -> origin/gh/shunting314/176/head 2025-09-07T07:46:33.8188918Z * [new branch] gh/shunting314/176/orig -> origin/gh/shunting314/176/orig 2025-09-07T07:46:33.8189516Z * [new branch] gh/shunting314/211/base -> origin/gh/shunting314/211/base 2025-09-07T07:46:33.8190112Z * [new branch] gh/shunting314/211/head -> origin/gh/shunting314/211/head 2025-09-07T07:46:33.8190709Z * [new branch] gh/shunting314/211/orig -> origin/gh/shunting314/211/orig 2025-09-07T07:46:33.8191297Z * [new 
branch] gh/shunting314/212/base -> origin/gh/shunting314/212/base 2025-09-07T07:46:33.8191901Z * [new branch] gh/shunting314/212/head -> origin/gh/shunting314/212/head 2025-09-07T07:46:33.8192497Z * [new branch] gh/shunting314/212/orig -> origin/gh/shunting314/212/orig 2025-09-07T07:46:33.8193090Z * [new branch] gh/shunting314/213/base -> origin/gh/shunting314/213/base 2025-09-07T07:46:33.8193686Z * [new branch] gh/shunting314/213/head -> origin/gh/shunting314/213/head 2025-09-07T07:46:33.8194276Z * [new branch] gh/shunting314/213/orig -> origin/gh/shunting314/213/orig 2025-09-07T07:46:33.8194880Z * [new branch] gh/shunting314/214/base -> origin/gh/shunting314/214/base 2025-09-07T07:46:33.8195478Z * [new branch] gh/shunting314/214/head -> origin/gh/shunting314/214/head 2025-09-07T07:46:33.8196174Z * [new branch] gh/shunting314/214/orig -> origin/gh/shunting314/214/orig 2025-09-07T07:46:33.8196776Z * [new branch] gh/shunting314/215/base -> origin/gh/shunting314/215/base 2025-09-07T07:46:33.8197365Z * [new branch] gh/shunting314/215/head -> origin/gh/shunting314/215/head 2025-09-07T07:46:33.8197968Z * [new branch] gh/shunting314/215/orig -> origin/gh/shunting314/215/orig 2025-09-07T07:46:33.8198569Z * [new branch] gh/shunting314/216/base -> origin/gh/shunting314/216/base 2025-09-07T07:46:33.8199171Z * [new branch] gh/shunting314/216/head -> origin/gh/shunting314/216/head 2025-09-07T07:46:33.8199772Z * [new branch] gh/shunting314/216/orig -> origin/gh/shunting314/216/orig 2025-09-07T07:46:33.8200355Z * [new branch] gh/shunting314/217/base -> origin/gh/shunting314/217/base 2025-09-07T07:46:33.8200955Z * [new branch] gh/shunting314/217/head -> origin/gh/shunting314/217/head 2025-09-07T07:46:33.8201563Z * [new branch] gh/shunting314/217/orig -> origin/gh/shunting314/217/orig 2025-09-07T07:46:33.8202166Z * [new branch] gh/shunting314/218/base -> origin/gh/shunting314/218/base 2025-09-07T07:46:33.8202770Z * [new branch] gh/shunting314/218/head -> 
origin/gh/shunting314/218/head 2025-09-07T07:46:33.8203479Z * [new branch] gh/shunting314/218/orig -> origin/gh/shunting314/218/orig 2025-09-07T07:46:33.8204083Z * [new branch] gh/shunting314/219/base -> origin/gh/shunting314/219/base 2025-09-07T07:46:33.8212688Z * [new branch] gh/shunting314/219/head -> origin/gh/shunting314/219/head 2025-09-07T07:46:33.8213488Z * [new branch] gh/shunting314/219/orig -> origin/gh/shunting314/219/orig 2025-09-07T07:46:33.8214092Z * [new branch] gh/shunting314/220/base -> origin/gh/shunting314/220/base 2025-09-07T07:46:33.8214704Z * [new branch] gh/shunting314/220/head -> origin/gh/shunting314/220/head 2025-09-07T07:46:33.8215308Z * [new branch] gh/shunting314/220/orig -> origin/gh/shunting314/220/orig 2025-09-07T07:46:33.8216133Z * [new branch] gh/shunting314/221/base -> origin/gh/shunting314/221/base 2025-09-07T07:46:33.8216746Z * [new branch] gh/shunting314/221/head -> origin/gh/shunting314/221/head 2025-09-07T07:46:33.8217434Z * [new branch] gh/shunting314/221/orig -> origin/gh/shunting314/221/orig 2025-09-07T07:46:33.8218041Z * [new branch] gh/shunting314/222/base -> origin/gh/shunting314/222/base 2025-09-07T07:46:33.8218647Z * [new branch] gh/shunting314/222/head -> origin/gh/shunting314/222/head 2025-09-07T07:46:33.8219247Z * [new branch] gh/shunting314/222/orig -> origin/gh/shunting314/222/orig 2025-09-07T07:46:33.8219848Z * [new branch] gh/shunting314/223/base -> origin/gh/shunting314/223/base 2025-09-07T07:46:33.8220435Z * [new branch] gh/shunting314/223/head -> origin/gh/shunting314/223/head 2025-09-07T07:46:33.8221038Z * [new branch] gh/shunting314/223/orig -> origin/gh/shunting314/223/orig 2025-09-07T07:46:33.8221630Z * [new branch] gh/silverguo/1/base -> origin/gh/silverguo/1/base 2025-09-07T07:46:33.8222210Z * [new branch] gh/silverguo/1/head -> origin/gh/silverguo/1/head 2025-09-07T07:46:33.8222769Z * [new branch] gh/silverguo/2/base -> origin/gh/silverguo/2/base 2025-09-07T07:46:33.8223342Z * [new branch] 
gh/silverguo/2/head -> origin/gh/silverguo/2/head 2025-09-07T07:46:33.8223910Z * [new branch] gh/silverguo/3/base -> origin/gh/silverguo/3/base 2025-09-07T07:46:33.8224474Z * [new branch] gh/silverguo/3/head -> origin/gh/silverguo/3/head 2025-09-07T07:46:33.8225041Z * [new branch] gh/silverguo/4/base -> origin/gh/silverguo/4/base 2025-09-07T07:46:33.8225726Z * [new branch] gh/silverguo/4/head -> origin/gh/silverguo/4/head 2025-09-07T07:46:33.8226313Z * [new branch] gh/sinhaanhsul/1/base -> origin/gh/sinhaanhsul/1/base 2025-09-07T07:46:33.8226906Z * [new branch] gh/sinhaanhsul/1/head -> origin/gh/sinhaanhsul/1/head 2025-09-07T07:46:33.8227492Z * [new branch] gh/skarjala/17/base -> origin/gh/skarjala/17/base 2025-09-07T07:46:33.8228053Z * [new branch] gh/skarjala/17/head -> origin/gh/skarjala/17/head 2025-09-07T07:46:33.8228603Z * [new branch] gh/skarjala/17/orig -> origin/gh/skarjala/17/orig 2025-09-07T07:46:33.8229164Z * [new branch] gh/skarjala/18/base -> origin/gh/skarjala/18/base 2025-09-07T07:46:33.8229723Z * [new branch] gh/skarjala/18/head -> origin/gh/skarjala/18/head 2025-09-07T07:46:33.8230290Z * [new branch] gh/skarjala/18/orig -> origin/gh/skarjala/18/orig 2025-09-07T07:46:33.8230848Z * [new branch] gh/skarjala/19/base -> origin/gh/skarjala/19/base 2025-09-07T07:46:33.8231402Z * [new branch] gh/skarjala/19/head -> origin/gh/skarjala/19/head 2025-09-07T07:46:33.8231964Z * [new branch] gh/skarjala/19/orig -> origin/gh/skarjala/19/orig 2025-09-07T07:46:33.8232528Z * [new branch] gh/slayton58/1/base -> origin/gh/slayton58/1/base 2025-09-07T07:46:33.8233091Z * [new branch] gh/slayton58/1/head -> origin/gh/slayton58/1/head 2025-09-07T07:46:33.8233656Z * [new branch] gh/slayton58/1/orig -> origin/gh/slayton58/1/orig 2025-09-07T07:46:33.8234208Z * [new branch] gh/slayton58/2/base -> origin/gh/slayton58/2/base 2025-09-07T07:46:33.8234770Z * [new branch] gh/slayton58/2/head -> origin/gh/slayton58/2/head 2025-09-07T07:46:33.8235338Z * [new branch] 
gh/slayton58/2/orig -> origin/gh/slayton58/2/orig 2025-09-07T07:46:33.8235902Z * [new branch] gh/slayton58/3/base -> origin/gh/slayton58/3/base 2025-09-07T07:46:33.8236570Z * [new branch] gh/slayton58/3/head -> origin/gh/slayton58/3/head 2025-09-07T07:46:33.8237132Z * [new branch] gh/slayton58/3/orig -> origin/gh/slayton58/3/orig 2025-09-07T07:46:33.8237701Z * [new branch] gh/slayton58/4/base -> origin/gh/slayton58/4/base 2025-09-07T07:46:33.8238264Z * [new branch] gh/slayton58/4/head -> origin/gh/slayton58/4/head 2025-09-07T07:46:33.8238820Z * [new branch] gh/slayton58/4/orig -> origin/gh/slayton58/4/orig 2025-09-07T07:46:33.8239378Z * [new branch] gh/slayton58/5/base -> origin/gh/slayton58/5/base 2025-09-07T07:46:33.8239926Z * [new branch] gh/slayton58/5/head -> origin/gh/slayton58/5/head 2025-09-07T07:46:33.8240490Z * [new branch] gh/slayton58/5/orig -> origin/gh/slayton58/5/orig 2025-09-07T07:46:33.8241066Z * [new branch] gh/soulitzer/269/base -> origin/gh/soulitzer/269/base 2025-09-07T07:46:33.8241651Z * [new branch] gh/soulitzer/269/head -> origin/gh/soulitzer/269/head 2025-09-07T07:46:33.8242236Z * [new branch] gh/soulitzer/269/orig -> origin/gh/soulitzer/269/orig 2025-09-07T07:46:33.8242807Z * [new branch] gh/soulitzer/276/base -> origin/gh/soulitzer/276/base 2025-09-07T07:46:33.8243534Z * [new branch] gh/soulitzer/276/head -> origin/gh/soulitzer/276/head 2025-09-07T07:46:33.8243776Z * [new branch] gh/soulitzer/276/orig -> origin/gh/soulitzer/276/orig 2025-09-07T07:46:33.8244003Z * [new branch] gh/soulitzer/287/base -> origin/gh/soulitzer/287/base 2025-09-07T07:46:33.8244228Z * [new branch] gh/soulitzer/287/head -> origin/gh/soulitzer/287/head 2025-09-07T07:46:33.8244579Z * [new branch] gh/soulitzer/287/orig -> origin/gh/soulitzer/287/orig 2025-09-07T07:46:33.8244808Z * [new branch] gh/soulitzer/296/base -> origin/gh/soulitzer/296/base 2025-09-07T07:46:33.8245046Z * [new branch] gh/soulitzer/296/head -> origin/gh/soulitzer/296/head 
2025-09-07T07:46:33.8245268Z * [new branch] gh/soulitzer/296/orig -> origin/gh/soulitzer/296/orig 2025-09-07T07:46:33.8245504Z * [new branch] gh/soulitzer/299/base -> origin/gh/soulitzer/299/base 2025-09-07T07:46:33.8245729Z * [new branch] gh/soulitzer/299/head -> origin/gh/soulitzer/299/head 2025-09-07T07:46:33.8245954Z * [new branch] gh/soulitzer/299/orig -> origin/gh/soulitzer/299/orig 2025-09-07T07:46:33.8246189Z * [new branch] gh/soulitzer/300/base -> origin/gh/soulitzer/300/base 2025-09-07T07:46:33.8246411Z * [new branch] gh/soulitzer/300/head -> origin/gh/soulitzer/300/head 2025-09-07T07:46:33.8246654Z * [new branch] gh/soulitzer/300/orig -> origin/gh/soulitzer/300/orig 2025-09-07T07:46:33.8246880Z * [new branch] gh/soulitzer/301/base -> origin/gh/soulitzer/301/base 2025-09-07T07:46:33.8247107Z * [new branch] gh/soulitzer/301/head -> origin/gh/soulitzer/301/head 2025-09-07T07:46:33.8247345Z * [new branch] gh/soulitzer/301/orig -> origin/gh/soulitzer/301/orig 2025-09-07T07:46:33.8247571Z * [new branch] gh/soulitzer/313/base -> origin/gh/soulitzer/313/base 2025-09-07T07:46:33.8247806Z * [new branch] gh/soulitzer/313/head -> origin/gh/soulitzer/313/head 2025-09-07T07:46:33.8248026Z * [new branch] gh/soulitzer/313/orig -> origin/gh/soulitzer/313/orig 2025-09-07T07:46:33.8248259Z * [new branch] gh/soulitzer/319/base -> origin/gh/soulitzer/319/base 2025-09-07T07:46:33.8248485Z * [new branch] gh/soulitzer/319/head -> origin/gh/soulitzer/319/head 2025-09-07T07:46:33.8248707Z * [new branch] gh/soulitzer/319/orig -> origin/gh/soulitzer/319/orig 2025-09-07T07:46:33.8249070Z * [new branch] gh/soulitzer/320/base -> origin/gh/soulitzer/320/base 2025-09-07T07:46:33.8249297Z * [new branch] gh/soulitzer/320/head -> origin/gh/soulitzer/320/head 2025-09-07T07:46:33.8249530Z * [new branch] gh/soulitzer/320/orig -> origin/gh/soulitzer/320/orig 2025-09-07T07:46:33.8249754Z * [new branch] gh/soulitzer/336/base -> origin/gh/soulitzer/336/base 2025-09-07T07:46:33.8249977Z * [new 
branch] gh/soulitzer/336/head -> origin/gh/soulitzer/336/head 2025-09-07T07:46:33.8250217Z * [new branch] gh/soulitzer/336/orig -> origin/gh/soulitzer/336/orig 2025-09-07T07:46:33.8250440Z * [new branch] gh/soulitzer/347/base -> origin/gh/soulitzer/347/base 2025-09-07T07:46:33.8250681Z * [new branch] gh/soulitzer/347/head -> origin/gh/soulitzer/347/head 2025-09-07T07:46:33.8250914Z * [new branch] gh/soulitzer/347/orig -> origin/gh/soulitzer/347/orig 2025-09-07T07:46:33.8251153Z * [new branch] gh/soulitzer/349/base -> origin/gh/soulitzer/349/base 2025-09-07T07:46:33.8251379Z * [new branch] gh/soulitzer/349/head -> origin/gh/soulitzer/349/head 2025-09-07T07:46:33.8251602Z * [new branch] gh/soulitzer/349/orig -> origin/gh/soulitzer/349/orig 2025-09-07T07:46:33.8251840Z * [new branch] gh/soulitzer/350/base -> origin/gh/soulitzer/350/base 2025-09-07T07:46:33.8252063Z * [new branch] gh/soulitzer/350/head -> origin/gh/soulitzer/350/head 2025-09-07T07:46:33.8252302Z * [new branch] gh/soulitzer/350/orig -> origin/gh/soulitzer/350/orig 2025-09-07T07:46:33.8252526Z * [new branch] gh/soulitzer/351/base -> origin/gh/soulitzer/351/base 2025-09-07T07:46:33.8252887Z * [new branch] gh/soulitzer/351/head -> origin/gh/soulitzer/351/head 2025-09-07T07:46:33.8253115Z * [new branch] gh/soulitzer/351/orig -> origin/gh/soulitzer/351/orig 2025-09-07T07:46:33.8253340Z * [new branch] gh/soulitzer/353/base -> origin/gh/soulitzer/353/base 2025-09-07T07:46:33.8253575Z * [new branch] gh/soulitzer/353/head -> origin/gh/soulitzer/353/head 2025-09-07T07:46:33.8253797Z * [new branch] gh/soulitzer/353/orig -> origin/gh/soulitzer/353/orig 2025-09-07T07:46:33.8254032Z * [new branch] gh/soulitzer/358/base -> origin/gh/soulitzer/358/base 2025-09-07T07:46:33.8254254Z * [new branch] gh/soulitzer/358/head -> origin/gh/soulitzer/358/head 2025-09-07T07:46:33.8254474Z * [new branch] gh/soulitzer/358/orig -> origin/gh/soulitzer/358/orig 2025-09-07T07:46:33.8254718Z * [new branch] gh/soulitzer/359/base -> 
origin/gh/soulitzer/359/base 2025-09-07T07:46:33.8254944Z * [new branch] gh/soulitzer/359/head -> origin/gh/soulitzer/359/head 2025-09-07T07:46:33.8255181Z * [new branch] gh/soulitzer/359/orig -> origin/gh/soulitzer/359/orig 2025-09-07T07:46:33.8255406Z * [new branch] gh/soulitzer/362/base -> origin/gh/soulitzer/362/base 2025-09-07T07:46:33.8255643Z * [new branch] gh/soulitzer/362/head -> origin/gh/soulitzer/362/head 2025-09-07T07:46:33.8255867Z * [new branch] gh/soulitzer/362/orig -> origin/gh/soulitzer/362/orig 2025-09-07T07:46:33.8256092Z * [new branch] gh/soulitzer/372/base -> origin/gh/soulitzer/372/base 2025-09-07T07:46:33.8256329Z * [new branch] gh/soulitzer/372/head -> origin/gh/soulitzer/372/head 2025-09-07T07:46:33.8256554Z * [new branch] gh/soulitzer/372/orig -> origin/gh/soulitzer/372/orig 2025-09-07T07:46:33.8256791Z * [new branch] gh/soulitzer/373/base -> origin/gh/soulitzer/373/base 2025-09-07T07:46:33.8257098Z * [new branch] gh/soulitzer/373/head -> origin/gh/soulitzer/373/head 2025-09-07T07:46:33.8257415Z * [new branch] gh/soulitzer/373/orig -> origin/gh/soulitzer/373/orig 2025-09-07T07:46:33.8257663Z * [new branch] gh/soulitzer/374/base -> origin/gh/soulitzer/374/base 2025-09-07T07:46:33.8257889Z * [new branch] gh/soulitzer/374/head -> origin/gh/soulitzer/374/head 2025-09-07T07:46:33.8258134Z * [new branch] gh/soulitzer/374/orig -> origin/gh/soulitzer/374/orig 2025-09-07T07:46:33.8258360Z * [new branch] gh/soulitzer/375/base -> origin/gh/soulitzer/375/base 2025-09-07T07:46:33.8258600Z * [new branch] gh/soulitzer/375/head -> origin/gh/soulitzer/375/head 2025-09-07T07:46:33.8258825Z * [new branch] gh/soulitzer/375/orig -> origin/gh/soulitzer/375/orig 2025-09-07T07:46:33.8259055Z * [new branch] gh/soulitzer/376/base -> origin/gh/soulitzer/376/base 2025-09-07T07:46:33.8259298Z * [new branch] gh/soulitzer/376/head -> origin/gh/soulitzer/376/head 2025-09-07T07:46:33.8259525Z * [new branch] gh/soulitzer/376/orig -> origin/gh/soulitzer/376/orig 
2025-09-07T07:46:33.8259768Z * [new branch] gh/soulitzer/377/base -> origin/gh/soulitzer/377/base 2025-09-07T07:46:33.8259992Z * [new branch] gh/soulitzer/377/head -> origin/gh/soulitzer/377/head 2025-09-07T07:46:33.8260228Z * [new branch] gh/soulitzer/377/orig -> origin/gh/soulitzer/377/orig 2025-09-07T07:46:33.8260453Z * [new branch] gh/soulitzer/378/base -> origin/gh/soulitzer/378/base 2025-09-07T07:46:33.8260675Z * [new branch] gh/soulitzer/378/head -> origin/gh/soulitzer/378/head 2025-09-07T07:46:33.8261010Z * [new branch] gh/soulitzer/378/orig -> origin/gh/soulitzer/378/orig 2025-09-07T07:46:33.8261234Z * [new branch] gh/soulitzer/379/base -> origin/gh/soulitzer/379/base 2025-09-07T07:46:33.8261473Z * [new branch] gh/soulitzer/379/head -> origin/gh/soulitzer/379/head 2025-09-07T07:46:33.8261697Z * [new branch] gh/soulitzer/379/orig -> origin/gh/soulitzer/379/orig 2025-09-07T07:46:33.8261920Z * [new branch] gh/swolchok/728/next -> origin/gh/swolchok/728/next 2025-09-07T07:46:33.8262155Z * [new branch] gh/swolchok/767/base -> origin/gh/swolchok/767/base 2025-09-07T07:46:33.8262374Z * [new branch] gh/swolchok/767/head -> origin/gh/swolchok/767/head 2025-09-07T07:46:33.8262603Z * [new branch] gh/swolchok/767/orig -> origin/gh/swolchok/767/orig 2025-09-07T07:46:33.8262821Z * [new branch] gh/swolchok/768/base -> origin/gh/swolchok/768/base 2025-09-07T07:46:33.8263056Z * [new branch] gh/swolchok/768/head -> origin/gh/swolchok/768/head 2025-09-07T07:46:33.8263277Z * [new branch] gh/swolchok/768/orig -> origin/gh/swolchok/768/orig 2025-09-07T07:46:33.8263495Z * [new branch] gh/swolchok/769/base -> origin/gh/swolchok/769/base 2025-09-07T07:46:33.8263727Z * [new branch] gh/swolchok/769/head -> origin/gh/swolchok/769/head 2025-09-07T07:46:33.8263945Z * [new branch] gh/swolchok/769/orig -> origin/gh/swolchok/769/orig 2025-09-07T07:46:33.8264178Z * [new branch] gh/swolchok/771/base -> origin/gh/swolchok/771/base 2025-09-07T07:46:33.8264393Z * [new branch] 
gh/swolchok/771/head -> origin/gh/swolchok/771/head 2025-09-07T07:46:33.8264610Z * [new branch] gh/swolchok/771/orig -> origin/gh/swolchok/771/orig 2025-09-07T07:46:33.8264839Z * [new branch] gh/swolchok/772/base -> origin/gh/swolchok/772/base 2025-09-07T07:46:33.8265061Z * [new branch] gh/swolchok/772/head -> origin/gh/swolchok/772/head 2025-09-07T07:46:33.8265380Z * [new branch] gh/swolchok/772/orig -> origin/gh/swolchok/772/orig 2025-09-07T07:46:33.8265600Z * [new branch] gh/swolchok/773/base -> origin/gh/swolchok/773/base 2025-09-07T07:46:33.8265832Z * [new branch] gh/swolchok/773/head -> origin/gh/swolchok/773/head 2025-09-07T07:46:33.8266047Z * [new branch] gh/swolchok/773/orig -> origin/gh/swolchok/773/orig 2025-09-07T07:46:33.8266264Z * [new branch] gh/swolchok/786/base -> origin/gh/swolchok/786/base 2025-09-07T07:46:33.8266494Z * [new branch] gh/swolchok/786/head -> origin/gh/swolchok/786/head 2025-09-07T07:46:33.8266711Z * [new branch] gh/swolchok/786/orig -> origin/gh/swolchok/786/orig 2025-09-07T07:46:33.8266947Z * [new branch] gh/swolchok/787/base -> origin/gh/swolchok/787/base 2025-09-07T07:46:33.8267163Z * [new branch] gh/swolchok/787/head -> origin/gh/swolchok/787/head 2025-09-07T07:46:33.8267397Z * [new branch] gh/swolchok/787/orig -> origin/gh/swolchok/787/orig 2025-09-07T07:46:33.8267613Z * [new branch] gh/swolchok/788/base -> origin/gh/swolchok/788/base 2025-09-07T07:46:33.8267831Z * [new branch] gh/swolchok/788/head -> origin/gh/swolchok/788/head 2025-09-07T07:46:33.8268063Z * [new branch] gh/swolchok/788/orig -> origin/gh/swolchok/788/orig 2025-09-07T07:46:33.8268280Z * [new branch] gh/swolchok/789/base -> origin/gh/swolchok/789/base 2025-09-07T07:46:33.8268511Z * [new branch] gh/swolchok/789/head -> origin/gh/swolchok/789/head 2025-09-07T07:46:33.8268730Z * [new branch] gh/swolchok/789/orig -> origin/gh/swolchok/789/orig 2025-09-07T07:46:33.8269030Z * [new branch] gh/swolchok/790/base -> origin/gh/swolchok/790/base 
2025-09-07T07:46:33.8269258Z * [new branch] gh/swolchok/790/head -> origin/gh/swolchok/790/head 2025-09-07T07:46:33.8269479Z * [new branch] gh/swolchok/790/orig -> origin/gh/swolchok/790/orig 2025-09-07T07:46:33.8269708Z * [new branch] gh/swolchok/791/base -> origin/gh/swolchok/791/base 2025-09-07T07:46:33.8269925Z * [new branch] gh/swolchok/791/head -> origin/gh/swolchok/791/head 2025-09-07T07:46:33.8270154Z * [new branch] gh/swolchok/791/orig -> origin/gh/swolchok/791/orig 2025-09-07T07:46:33.8270371Z * [new branch] gh/swolchok/792/base -> origin/gh/swolchok/792/base 2025-09-07T07:46:33.8270589Z * [new branch] gh/swolchok/792/head -> origin/gh/swolchok/792/head 2025-09-07T07:46:33.8270816Z * [new branch] gh/swolchok/792/orig -> origin/gh/swolchok/792/orig 2025-09-07T07:46:33.8271037Z * [new branch] gh/swolchok/793/base -> origin/gh/swolchok/793/base 2025-09-07T07:46:33.8271271Z * [new branch] gh/swolchok/793/head -> origin/gh/swolchok/793/head 2025-09-07T07:46:33.8271490Z * [new branch] gh/swolchok/793/orig -> origin/gh/swolchok/793/orig 2025-09-07T07:46:33.8271706Z * [new branch] gh/swolchok/794/base -> origin/gh/swolchok/794/base 2025-09-07T07:46:33.8271936Z * [new branch] gh/swolchok/794/head -> origin/gh/swolchok/794/head 2025-09-07T07:46:33.8272155Z * [new branch] gh/swolchok/794/orig -> origin/gh/swolchok/794/orig 2025-09-07T07:46:33.8272385Z * [new branch] gh/swolchok/795/base -> origin/gh/swolchok/795/base 2025-09-07T07:46:33.8272605Z * [new branch] gh/swolchok/795/head -> origin/gh/swolchok/795/head 2025-09-07T07:46:33.8272838Z * [new branch] gh/swolchok/795/orig -> origin/gh/swolchok/795/orig 2025-09-07T07:46:33.8273058Z * [new branch] gh/swolchok/796/base -> origin/gh/swolchok/796/base 2025-09-07T07:46:33.8273361Z * [new branch] gh/swolchok/796/head -> origin/gh/swolchok/796/head 2025-09-07T07:46:33.8273594Z * [new branch] gh/swolchok/796/orig -> origin/gh/swolchok/796/orig 2025-09-07T07:46:33.8273811Z * [new branch] gh/swolchok/797/base -> 
origin/gh/swolchok/797/base 2025-09-07T07:46:33.8274043Z * [new branch] gh/swolchok/797/head -> origin/gh/swolchok/797/head 2025-09-07T07:46:33.8274260Z * [new branch] gh/swolchok/797/orig -> origin/gh/swolchok/797/orig 2025-09-07T07:46:33.8274488Z * [new branch] gh/swolchok/798/base -> origin/gh/swolchok/798/base 2025-09-07T07:46:33.8274707Z * [new branch] gh/swolchok/798/head -> origin/gh/swolchok/798/head 2025-09-07T07:46:33.8274929Z * [new branch] gh/swolchok/798/orig -> origin/gh/swolchok/798/orig 2025-09-07T07:46:33.8275160Z * [new branch] gh/swolchok/799/base -> origin/gh/swolchok/799/base 2025-09-07T07:46:33.8275379Z * [new branch] gh/swolchok/799/head -> origin/gh/swolchok/799/head 2025-09-07T07:46:33.8275611Z * [new branch] gh/swolchok/799/orig -> origin/gh/swolchok/799/orig 2025-09-07T07:46:33.8275829Z * [new branch] gh/swolchok/800/base -> origin/gh/swolchok/800/base 2025-09-07T07:46:33.8276047Z * [new branch] gh/swolchok/800/head -> origin/gh/swolchok/800/head 2025-09-07T07:46:33.8276280Z * [new branch] gh/swolchok/800/orig -> origin/gh/swolchok/800/orig 2025-09-07T07:46:33.8276501Z * [new branch] gh/swolchok/801/base -> origin/gh/swolchok/801/base 2025-09-07T07:46:33.8276730Z * [new branch] gh/swolchok/801/head -> origin/gh/swolchok/801/head 2025-09-07T07:46:33.8277032Z * [new branch] gh/swolchok/801/orig -> origin/gh/swolchok/801/orig 2025-09-07T07:46:33.8277261Z * [new branch] gh/swolchok/802/base -> origin/gh/swolchok/802/base 2025-09-07T07:46:33.8277481Z * [new branch] gh/swolchok/802/head -> origin/gh/swolchok/802/head 2025-09-07T07:46:33.8277700Z * [new branch] gh/swolchok/802/orig -> origin/gh/swolchok/802/orig 2025-09-07T07:46:33.8277930Z * [new branch] gh/swolchok/803/base -> origin/gh/swolchok/803/base 2025-09-07T07:46:33.8278148Z * [new branch] gh/swolchok/803/head -> origin/gh/swolchok/803/head 2025-09-07T07:46:33.8278375Z * [new branch] gh/swolchok/803/orig -> origin/gh/swolchok/803/orig 2025-09-07T07:46:33.8278592Z * [new branch] 
gh/swolchok/804/base -> origin/gh/swolchok/804/base 2025-09-07T07:46:33.8278826Z * [new branch] gh/swolchok/804/head -> origin/gh/swolchok/804/head 2025-09-07T07:46:33.8279048Z * [new branch] gh/swolchok/804/orig -> origin/gh/swolchok/804/orig 2025-09-07T07:46:33.8279270Z * [new branch] gh/swolchok/805/base -> origin/gh/swolchok/805/base 2025-09-07T07:46:33.8279502Z * [new branch] gh/swolchok/805/head -> origin/gh/swolchok/805/head 2025-09-07T07:46:33.8279721Z * [new branch] gh/swolchok/805/orig -> origin/gh/swolchok/805/orig 2025-09-07T07:46:33.8279950Z * [new branch] gh/swolchok/806/base -> origin/gh/swolchok/806/base 2025-09-07T07:46:33.8280166Z * [new branch] gh/swolchok/806/head -> origin/gh/swolchok/806/head 2025-09-07T07:46:33.8280384Z * [new branch] gh/swolchok/806/orig -> origin/gh/swolchok/806/orig 2025-09-07T07:46:33.8280614Z * [new branch] gh/swolchok/807/base -> origin/gh/swolchok/807/base 2025-09-07T07:46:33.8280832Z * [new branch] gh/swolchok/807/head -> origin/gh/swolchok/807/head 2025-09-07T07:46:33.8281066Z * [new branch] gh/swolchok/807/orig -> origin/gh/swolchok/807/orig 2025-09-07T07:46:33.8281366Z * [new branch] gh/swolchok/808/base -> origin/gh/swolchok/808/base 2025-09-07T07:46:33.8281600Z * [new branch] gh/swolchok/808/head -> origin/gh/swolchok/808/head 2025-09-07T07:46:33.8281817Z * [new branch] gh/swolchok/808/orig -> origin/gh/swolchok/808/orig 2025-09-07T07:46:33.8282036Z * [new branch] gh/swolchok/809/base -> origin/gh/swolchok/809/base 2025-09-07T07:46:33.8282267Z * [new branch] gh/swolchok/809/head -> origin/gh/swolchok/809/head 2025-09-07T07:46:33.8282485Z * [new branch] gh/swolchok/809/orig -> origin/gh/swolchok/809/orig 2025-09-07T07:46:33.8282716Z * [new branch] gh/swolchok/810/base -> origin/gh/swolchok/810/base 2025-09-07T07:46:33.8283075Z * [new branch] gh/swolchok/810/head -> origin/gh/swolchok/810/head 2025-09-07T07:46:33.8283293Z * [new branch] gh/swolchok/810/orig -> origin/gh/swolchok/810/orig 
2025-09-07T07:46:33.8283531Z * [new branch] gh/swolchok/811/base -> origin/gh/swolchok/811/base 2025-09-07T07:46:33.8283750Z * [new branch] gh/swolchok/811/head -> origin/gh/swolchok/811/head 2025-09-07T07:46:33.8283982Z * [new branch] gh/swolchok/811/orig -> origin/gh/swolchok/811/orig 2025-09-07T07:46:33.8284201Z * [new branch] gh/swolchok/812/base -> origin/gh/swolchok/812/base 2025-09-07T07:46:33.8284432Z * [new branch] gh/swolchok/812/head -> origin/gh/swolchok/812/head 2025-09-07T07:46:33.8284652Z * [new branch] gh/swolchok/812/orig -> origin/gh/swolchok/812/orig 2025-09-07T07:46:33.8284871Z * [new branch] gh/swolchok/813/base -> origin/gh/swolchok/813/base 2025-09-07T07:46:33.8285238Z * [new branch] gh/swolchok/813/head -> origin/gh/swolchok/813/head 2025-09-07T07:46:33.8285459Z * [new branch] gh/swolchok/813/orig -> origin/gh/swolchok/813/orig 2025-09-07T07:46:33.8285698Z * [new branch] gh/swolchok/814/base -> origin/gh/swolchok/814/base 2025-09-07T07:46:33.8285919Z * [new branch] gh/swolchok/814/head -> origin/gh/swolchok/814/head 2025-09-07T07:46:33.8286154Z * [new branch] gh/swolchok/814/orig -> origin/gh/swolchok/814/orig 2025-09-07T07:46:33.8286371Z * [new branch] gh/swolchok/815/base -> origin/gh/swolchok/815/base 2025-09-07T07:46:33.8286588Z * [new branch] gh/swolchok/815/head -> origin/gh/swolchok/815/head 2025-09-07T07:46:33.8286819Z * [new branch] gh/swolchok/815/orig -> origin/gh/swolchok/815/orig 2025-09-07T07:46:33.8287038Z * [new branch] gh/swolchok/816/base -> origin/gh/swolchok/816/base 2025-09-07T07:46:33.8287274Z * [new branch] gh/swolchok/816/head -> origin/gh/swolchok/816/head 2025-09-07T07:46:33.8287496Z * [new branch] gh/swolchok/816/orig -> origin/gh/swolchok/816/orig 2025-09-07T07:46:33.8287713Z * [new branch] gh/swolchok/817/base -> origin/gh/swolchok/817/base 2025-09-07T07:46:33.8287940Z * [new branch] gh/swolchok/817/head -> origin/gh/swolchok/817/head 2025-09-07T07:46:33.8288155Z * [new branch] gh/swolchok/817/orig -> 
origin/gh/swolchok/817/orig 2025-09-07T07:46:33.8288384Z * [new branch] gh/swolchok/818/base -> origin/gh/swolchok/818/base 2025-09-07T07:46:33.8288601Z * [new branch] gh/swolchok/818/head -> origin/gh/swolchok/818/head 2025-09-07T07:46:33.8288829Z * [new branch] gh/swolchok/818/orig -> origin/gh/swolchok/818/orig 2025-09-07T07:46:33.8289046Z * [new branch] gh/swolchok/819/base -> origin/gh/swolchok/819/base 2025-09-07T07:46:33.8289265Z * [new branch] gh/swolchok/819/head -> origin/gh/swolchok/819/head 2025-09-07T07:46:33.8289649Z * [new branch] gh/swolchok/819/orig -> origin/gh/swolchok/819/orig 2025-09-07T07:46:33.8289869Z * [new branch] gh/swolchok/820/base -> origin/gh/swolchok/820/base 2025-09-07T07:46:33.8290099Z * [new branch] gh/swolchok/820/head -> origin/gh/swolchok/820/head 2025-09-07T07:46:33.8290315Z * [new branch] gh/swolchok/820/orig -> origin/gh/swolchok/820/orig 2025-09-07T07:46:33.8290533Z * [new branch] gh/swolchok/821/base -> origin/gh/swolchok/821/base 2025-09-07T07:46:33.8290762Z * [new branch] gh/swolchok/821/head -> origin/gh/swolchok/821/head 2025-09-07T07:46:33.8290980Z * [new branch] gh/swolchok/821/orig -> origin/gh/swolchok/821/orig 2025-09-07T07:46:33.8291212Z * [new branch] gh/swolchok/822/base -> origin/gh/swolchok/822/base 2025-09-07T07:46:33.8291433Z * [new branch] gh/swolchok/822/head -> origin/gh/swolchok/822/head 2025-09-07T07:46:33.8291666Z * [new branch] gh/swolchok/822/orig -> origin/gh/swolchok/822/orig 2025-09-07T07:46:33.8291882Z * [new branch] gh/swolchok/823/base -> origin/gh/swolchok/823/base 2025-09-07T07:46:33.8292099Z * [new branch] gh/swolchok/823/head -> origin/gh/swolchok/823/head 2025-09-07T07:46:33.8292326Z * [new branch] gh/swolchok/823/orig -> origin/gh/swolchok/823/orig 2025-09-07T07:46:33.8292545Z * [new branch] gh/swolchok/824/base -> origin/gh/swolchok/824/base 2025-09-07T07:46:33.8292773Z * [new branch] gh/swolchok/824/head -> origin/gh/swolchok/824/head 2025-09-07T07:46:33.8292991Z * [new branch] 
gh/swolchok/824/orig -> origin/gh/swolchok/824/orig 2025-09-07T07:46:33.8293302Z * [new branch] gh/swolchok/825/base -> origin/gh/swolchok/825/base 2025-09-07T07:46:33.8293516Z * [new branch] gh/swolchok/825/head -> origin/gh/swolchok/825/head 2025-09-07T07:46:33.8293735Z * [new branch] gh/swolchok/825/orig -> origin/gh/swolchok/825/orig 2025-09-07T07:46:33.8293970Z * [new branch] gh/swolchok/826/base -> origin/gh/swolchok/826/base 2025-09-07T07:46:33.8294188Z * [new branch] gh/swolchok/826/head -> origin/gh/swolchok/826/head 2025-09-07T07:46:33.8294420Z * [new branch] gh/swolchok/826/orig -> origin/gh/swolchok/826/orig 2025-09-07T07:46:33.8294638Z * [new branch] gh/swolchok/827/base -> origin/gh/swolchok/827/base 2025-09-07T07:46:33.8294854Z * [new branch] gh/swolchok/827/head -> origin/gh/swolchok/827/head 2025-09-07T07:46:33.8295084Z * [new branch] gh/swolchok/827/orig -> origin/gh/swolchok/827/orig 2025-09-07T07:46:33.8295305Z * [new branch] gh/swolchok/828/base -> origin/gh/swolchok/828/base 2025-09-07T07:46:33.8295539Z * [new branch] gh/swolchok/828/head -> origin/gh/swolchok/828/head 2025-09-07T07:46:33.8295757Z * [new branch] gh/swolchok/828/orig -> origin/gh/swolchok/828/orig 2025-09-07T07:46:33.8295985Z * [new branch] gh/swolchok/829/base -> origin/gh/swolchok/829/base 2025-09-07T07:46:33.8296202Z * [new branch] gh/swolchok/829/head -> origin/gh/swolchok/829/head 2025-09-07T07:46:33.8296419Z * [new branch] gh/swolchok/829/orig -> origin/gh/swolchok/829/orig 2025-09-07T07:46:33.8296644Z * [new branch] gh/swolchok/830/base -> origin/gh/swolchok/830/base 2025-09-07T07:46:33.8296861Z * [new branch] gh/swolchok/830/head -> origin/gh/swolchok/830/head 2025-09-07T07:46:33.8297089Z * [new branch] gh/swolchok/830/orig -> origin/gh/swolchok/830/orig 2025-09-07T07:46:33.8297396Z * [new branch] gh/swolchok/831/base -> origin/gh/swolchok/831/base 2025-09-07T07:46:33.8297724Z * [new branch] gh/swolchok/831/head -> origin/gh/swolchok/831/head 
2025-09-07T07:46:33.8297957Z * [new branch] gh/swolchok/831/orig -> origin/gh/swolchok/831/orig 2025-09-07T07:46:33.8298175Z * [new branch] gh/swolchok/832/base -> origin/gh/swolchok/832/base 2025-09-07T07:46:33.8298408Z * [new branch] gh/swolchok/832/head -> origin/gh/swolchok/832/head 2025-09-07T07:46:33.8298623Z * [new branch] gh/swolchok/832/orig -> origin/gh/swolchok/832/orig 2025-09-07T07:46:33.8298857Z * [new branch] gh/syed-ahmed/3/base -> origin/gh/syed-ahmed/3/base 2025-09-07T07:46:33.8299079Z * [new branch] gh/syed-ahmed/3/head -> origin/gh/syed-ahmed/3/head 2025-09-07T07:46:33.8299295Z * [new branch] gh/syed-ahmed/3/orig -> origin/gh/syed-ahmed/3/orig 2025-09-07T07:46:33.8299530Z * [new branch] gh/syed-ahmed/4/base -> origin/gh/syed-ahmed/4/base 2025-09-07T07:46:33.8299750Z * [new branch] gh/syed-ahmed/4/head -> origin/gh/syed-ahmed/4/head 2025-09-07T07:46:33.8299980Z * [new branch] gh/syed-ahmed/4/orig -> origin/gh/syed-ahmed/4/orig 2025-09-07T07:46:33.8300197Z * [new branch] gh/syed-ahmed/5/base -> origin/gh/syed-ahmed/5/base 2025-09-07T07:46:33.8300425Z * [new branch] gh/syed-ahmed/5/head -> origin/gh/syed-ahmed/5/head 2025-09-07T07:46:33.8300640Z * [new branch] gh/syed-ahmed/5/orig -> origin/gh/syed-ahmed/5/orig 2025-09-07T07:46:33.8300854Z * [new branch] gh/teja-rao/4/base -> origin/gh/teja-rao/4/base 2025-09-07T07:46:33.8301079Z * [new branch] gh/teja-rao/4/head -> origin/gh/teja-rao/4/head 2025-09-07T07:46:33.8301378Z * [new branch] gh/teja-rao/4/orig -> origin/gh/teja-rao/4/orig 2025-09-07T07:46:33.8301601Z * [new branch] gh/tianyu-l/2/base -> origin/gh/tianyu-l/2/base 2025-09-07T07:46:33.8301813Z * [new branch] gh/tianyu-l/2/head -> origin/gh/tianyu-l/2/head 2025-09-07T07:46:33.8302020Z * [new branch] gh/tianyu-l/2/orig -> origin/gh/tianyu-l/2/orig 2025-09-07T07:46:33.8302237Z * [new branch] gh/tianyu-l/3/base -> origin/gh/tianyu-l/3/base 2025-09-07T07:46:33.8302442Z * [new branch] gh/tianyu-l/3/head -> origin/gh/tianyu-l/3/head 
2025-09-07T07:46:33.8302661Z * [new branch] gh/tianyu-l/3/orig -> origin/gh/tianyu-l/3/orig 2025-09-07T07:46:33.8302868Z * [new branch] gh/tianyu-l/4/base -> origin/gh/tianyu-l/4/base 2025-09-07T07:46:33.8303089Z * [new branch] gh/tianyu-l/4/head -> origin/gh/tianyu-l/4/head 2025-09-07T07:46:33.8303297Z * [new branch] gh/tianyu-l/4/orig -> origin/gh/tianyu-l/4/orig 2025-09-07T07:46:33.8303556Z * [new branch] gh/tugsbayasgalan/1/base -> origin/gh/tugsbayasgalan/1/base 2025-09-07T07:46:33.8303821Z * [new branch] gh/tugsbayasgalan/1/head -> origin/gh/tugsbayasgalan/1/head 2025-09-07T07:46:33.8304069Z * [new branch] gh/tugsbayasgalan/1/orig -> origin/gh/tugsbayasgalan/1/orig 2025-09-07T07:46:33.8304340Z * [new branch] gh/tugsbayasgalan/10/base -> origin/gh/tugsbayasgalan/10/base 2025-09-07T07:46:33.8304593Z * [new branch] gh/tugsbayasgalan/10/head -> origin/gh/tugsbayasgalan/10/head 2025-09-07T07:46:33.8304847Z * [new branch] gh/tugsbayasgalan/10/orig -> origin/gh/tugsbayasgalan/10/orig 2025-09-07T07:46:33.8305113Z * [new branch] gh/tugsbayasgalan/11/base -> origin/gh/tugsbayasgalan/11/base 2025-09-07T07:46:33.8305368Z * [new branch] gh/tugsbayasgalan/11/head -> origin/gh/tugsbayasgalan/11/head 2025-09-07T07:46:33.8305633Z * [new branch] gh/tugsbayasgalan/11/orig -> origin/gh/tugsbayasgalan/11/orig 2025-09-07T07:46:33.8305981Z * [new branch] gh/tugsbayasgalan/12/base -> origin/gh/tugsbayasgalan/12/base 2025-09-07T07:46:33.8306248Z * [new branch] gh/tugsbayasgalan/12/head -> origin/gh/tugsbayasgalan/12/head 2025-09-07T07:46:33.8306498Z * [new branch] gh/tugsbayasgalan/12/orig -> origin/gh/tugsbayasgalan/12/orig 2025-09-07T07:46:33.8306750Z * [new branch] gh/tugsbayasgalan/13/base -> origin/gh/tugsbayasgalan/13/base 2025-09-07T07:46:33.8307018Z * [new branch] gh/tugsbayasgalan/13/head -> origin/gh/tugsbayasgalan/13/head 2025-09-07T07:46:33.8307269Z * [new branch] gh/tugsbayasgalan/13/orig -> origin/gh/tugsbayasgalan/13/orig 2025-09-07T07:46:33.8307530Z * [new branch] 
gh/tugsbayasgalan/14/base -> origin/gh/tugsbayasgalan/14/base 2025-09-07T07:46:33.8307783Z * [new branch] gh/tugsbayasgalan/14/head -> origin/gh/tugsbayasgalan/14/head 2025-09-07T07:46:33.8308049Z * [new branch] gh/tugsbayasgalan/14/orig -> origin/gh/tugsbayasgalan/14/orig 2025-09-07T07:46:33.8308300Z * [new branch] gh/tugsbayasgalan/15/base -> origin/gh/tugsbayasgalan/15/base 2025-09-07T07:46:33.8308549Z * [new branch] gh/tugsbayasgalan/15/head -> origin/gh/tugsbayasgalan/15/head 2025-09-07T07:46:33.8308800Z * [new branch] gh/tugsbayasgalan/15/orig -> origin/gh/tugsbayasgalan/15/orig 2025-09-07T07:46:33.8309052Z * [new branch] gh/tugsbayasgalan/2/base -> origin/gh/tugsbayasgalan/2/base 2025-09-07T07:46:33.8309313Z * [new branch] gh/tugsbayasgalan/2/head -> origin/gh/tugsbayasgalan/2/head 2025-09-07T07:46:33.8309561Z * [new branch] gh/tugsbayasgalan/2/orig -> origin/gh/tugsbayasgalan/2/orig 2025-09-07T07:46:33.8309811Z * [new branch] gh/tugsbayasgalan/3/base -> origin/gh/tugsbayasgalan/3/base 2025-09-07T07:46:33.8310160Z * [new branch] gh/tugsbayasgalan/3/head -> origin/gh/tugsbayasgalan/3/head 2025-09-07T07:46:33.8310409Z * [new branch] gh/tugsbayasgalan/3/orig -> origin/gh/tugsbayasgalan/3/orig 2025-09-07T07:46:33.8310673Z * [new branch] gh/tugsbayasgalan/4/base -> origin/gh/tugsbayasgalan/4/base 2025-09-07T07:46:33.8310920Z * [new branch] gh/tugsbayasgalan/4/head -> origin/gh/tugsbayasgalan/4/head 2025-09-07T07:46:33.8311164Z * [new branch] gh/tugsbayasgalan/4/orig -> origin/gh/tugsbayasgalan/4/orig 2025-09-07T07:46:33.8311421Z * [new branch] gh/tugsbayasgalan/5/base -> origin/gh/tugsbayasgalan/5/base 2025-09-07T07:46:33.8311665Z * [new branch] gh/tugsbayasgalan/5/head -> origin/gh/tugsbayasgalan/5/head 2025-09-07T07:46:33.8311926Z * [new branch] gh/tugsbayasgalan/5/orig -> origin/gh/tugsbayasgalan/5/orig 2025-09-07T07:46:33.8312175Z * [new branch] gh/tugsbayasgalan/6/base -> origin/gh/tugsbayasgalan/6/base 2025-09-07T07:46:33.8312423Z * [new branch] 
gh/tugsbayasgalan/6/head -> origin/gh/tugsbayasgalan/6/head 2025-09-07T07:46:33.8312685Z * [new branch] gh/tugsbayasgalan/6/orig -> origin/gh/tugsbayasgalan/6/orig 2025-09-07T07:46:33.8312931Z * [new branch] gh/tugsbayasgalan/7/base -> origin/gh/tugsbayasgalan/7/base 2025-09-07T07:46:33.8313191Z * [new branch] gh/tugsbayasgalan/7/head -> origin/gh/tugsbayasgalan/7/head 2025-09-07T07:46:33.8313436Z * [new branch] gh/tugsbayasgalan/7/orig -> origin/gh/tugsbayasgalan/7/orig 2025-09-07T07:46:33.8313694Z * [new branch] gh/tugsbayasgalan/8/base -> origin/gh/tugsbayasgalan/8/base 2025-09-07T07:46:33.8313940Z * [new branch] gh/tugsbayasgalan/8/head -> origin/gh/tugsbayasgalan/8/head 2025-09-07T07:46:33.8314190Z * [new branch] gh/tugsbayasgalan/8/orig -> origin/gh/tugsbayasgalan/8/orig 2025-09-07T07:46:33.8314449Z * [new branch] gh/tugsbayasgalan/9/base -> origin/gh/tugsbayasgalan/9/base 2025-09-07T07:46:33.8314795Z * [new branch] gh/tugsbayasgalan/9/head -> origin/gh/tugsbayasgalan/9/head 2025-09-07T07:46:33.8315056Z * [new branch] gh/tugsbayasgalan/9/orig -> origin/gh/tugsbayasgalan/9/orig 2025-09-07T07:46:33.8315257Z * [new branch] gh/v0i0/1/base -> origin/gh/v0i0/1/base 2025-09-07T07:46:33.8315461Z * [new branch] gh/v0i0/1/head -> origin/gh/v0i0/1/head 2025-09-07T07:46:33.8315652Z * [new branch] gh/v0i0/1/orig -> origin/gh/v0i0/1/orig 2025-09-07T07:46:33.8315840Z * [new branch] gh/v0i0/4/base -> origin/gh/v0i0/4/base 2025-09-07T07:46:33.8316040Z * [new branch] gh/v0i0/4/head -> origin/gh/v0i0/4/head 2025-09-07T07:46:33.8316230Z * [new branch] gh/v0i0/4/orig -> origin/gh/v0i0/4/orig 2025-09-07T07:46:33.8316431Z * [new branch] gh/v0i0/6/base -> origin/gh/v0i0/6/base 2025-09-07T07:46:33.8316624Z * [new branch] gh/v0i0/6/head -> origin/gh/v0i0/6/head 2025-09-07T07:46:33.8316813Z * [new branch] gh/v0i0/6/orig -> origin/gh/v0i0/6/orig 2025-09-07T07:46:33.8317016Z * [new branch] gh/v0i0/7/base -> origin/gh/v0i0/7/base 2025-09-07T07:46:33.8317204Z * [new branch] gh/v0i0/7/head 
-> origin/gh/v0i0/7/head 2025-09-07T07:46:33.8317407Z * [new branch] gh/v0i0/7/orig -> origin/gh/v0i0/7/orig 2025-09-07T07:46:33.8317596Z * [new branch] gh/v0i0/8/base -> origin/gh/v0i0/8/base 2025-09-07T07:46:33.8317798Z * [new branch] gh/v0i0/8/head -> origin/gh/v0i0/8/head 2025-09-07T07:46:33.8317987Z * [new branch] gh/v0i0/8/orig -> origin/gh/v0i0/8/orig 2025-09-07T07:46:33.8318268Z * [new branch] gh/v0i0/9/base -> origin/gh/v0i0/9/base 2025-09-07T07:46:33.8318473Z * [new branch] gh/v0i0/9/head -> origin/gh/v0i0/9/head 2025-09-07T07:46:33.8318662Z * [new branch] gh/v0i0/9/orig -> origin/gh/v0i0/9/orig 2025-09-07T07:46:33.8318872Z * [new branch] gh/vkuzo/1/next -> origin/gh/vkuzo/1/next 2025-09-07T07:46:33.8319070Z * [new branch] gh/vkuzo/2/next -> origin/gh/vkuzo/2/next 2025-09-07T07:46:33.8319264Z * [new branch] gh/vkuzo/3/next -> origin/gh/vkuzo/3/next 2025-09-07T07:46:33.8319470Z * [new branch] gh/vkuzo/4/base -> origin/gh/vkuzo/4/base 2025-09-07T07:46:33.8319664Z * [new branch] gh/vkuzo/4/head -> origin/gh/vkuzo/4/head 2025-09-07T07:46:33.8319870Z * [new branch] gh/vkuzo/4/orig -> origin/gh/vkuzo/4/orig 2025-09-07T07:46:33.8320067Z * [new branch] gh/vkuzo/5/base -> origin/gh/vkuzo/5/base 2025-09-07T07:46:33.8320277Z * [new branch] gh/vkuzo/5/head -> origin/gh/vkuzo/5/head 2025-09-07T07:46:33.8320471Z * [new branch] gh/vkuzo/5/orig -> origin/gh/vkuzo/5/orig 2025-09-07T07:46:33.8320665Z * [new branch] gh/vkuzo/6/base -> origin/gh/vkuzo/6/base 2025-09-07T07:46:33.8320873Z * [new branch] gh/vkuzo/6/head -> origin/gh/vkuzo/6/head 2025-09-07T07:46:33.8321065Z * [new branch] gh/vkuzo/6/orig -> origin/gh/vkuzo/6/orig 2025-09-07T07:46:33.8321272Z * [new branch] gh/vkuzo/7/base -> origin/gh/vkuzo/7/base 2025-09-07T07:46:33.8321466Z * [new branch] gh/vkuzo/7/head -> origin/gh/vkuzo/7/head 2025-09-07T07:46:33.8321658Z * [new branch] gh/vkuzo/7/orig -> origin/gh/vkuzo/7/orig 2025-09-07T07:46:33.8321900Z * [new branch] gh/wconstab/419/base -> origin/gh/wconstab/419/base 
2025-09-07T07:46:33.8322205Z * [new branch] gh/wconstab/419/head -> origin/gh/wconstab/419/head 2025-09-07T07:46:33.8322438Z * [new branch] gh/wconstab/419/orig -> origin/gh/wconstab/419/orig 2025-09-07T07:46:33.8322656Z * [new branch] gh/wconstab/424/base -> origin/gh/wconstab/424/base 2025-09-07T07:46:33.8323013Z * [new branch] gh/wconstab/424/head -> origin/gh/wconstab/424/head 2025-09-07T07:46:33.8323237Z * [new branch] gh/wconstab/424/orig -> origin/gh/wconstab/424/orig 2025-09-07T07:46:33.8323455Z * [new branch] gh/wconstab/435/base -> origin/gh/wconstab/435/base 2025-09-07T07:46:33.8323686Z * [new branch] gh/wconstab/435/head -> origin/gh/wconstab/435/head 2025-09-07T07:46:33.8323907Z * [new branch] gh/wconstab/435/orig -> origin/gh/wconstab/435/orig 2025-09-07T07:46:33.8324143Z * [new branch] gh/wconstab/438/base -> origin/gh/wconstab/438/base 2025-09-07T07:46:33.8324369Z * [new branch] gh/wconstab/438/head -> origin/gh/wconstab/438/head 2025-09-07T07:46:33.8324593Z * [new branch] gh/wconstab/438/orig -> origin/gh/wconstab/438/orig 2025-09-07T07:46:33.8324824Z * [new branch] gh/wconstab/440/base -> origin/gh/wconstab/440/base 2025-09-07T07:46:33.8325040Z * [new branch] gh/wconstab/440/head -> origin/gh/wconstab/440/head 2025-09-07T07:46:33.8325272Z * [new branch] gh/wconstab/440/orig -> origin/gh/wconstab/440/orig 2025-09-07T07:46:33.8325487Z * [new branch] gh/wconstab/441/base -> origin/gh/wconstab/441/base 2025-09-07T07:46:33.8325716Z * [new branch] gh/wconstab/441/head -> origin/gh/wconstab/441/head 2025-09-07T07:46:33.8326057Z * [new branch] gh/wconstab/441/orig -> origin/gh/wconstab/441/orig 2025-09-07T07:46:33.8326272Z * [new branch] gh/wconstab/442/base -> origin/gh/wconstab/442/base 2025-09-07T07:46:33.8326507Z * [new branch] gh/wconstab/442/head -> origin/gh/wconstab/442/head 2025-09-07T07:46:33.8326724Z * [new branch] gh/wconstab/442/orig -> origin/gh/wconstab/442/orig 2025-09-07T07:46:33.8326958Z * [new branch] gh/wconstab/443/base -> 
origin/gh/wconstab/443/base 2025-09-07T07:46:33.8327173Z * [new branch] gh/wconstab/443/head -> origin/gh/wconstab/443/head 2025-09-07T07:46:33.8327405Z * [new branch] gh/wconstab/443/orig -> origin/gh/wconstab/443/orig 2025-09-07T07:46:33.8327623Z * [new branch] gh/wconstab/444/base -> origin/gh/wconstab/444/base 2025-09-07T07:46:33.8327836Z * [new branch] gh/wconstab/444/head -> origin/gh/wconstab/444/head 2025-09-07T07:46:33.8328071Z * [new branch] gh/wconstab/444/orig -> origin/gh/wconstab/444/orig 2025-09-07T07:46:33.8328292Z * [new branch] gh/wconstab/445/base -> origin/gh/wconstab/445/base 2025-09-07T07:46:33.8328523Z * [new branch] gh/wconstab/445/head -> origin/gh/wconstab/445/head 2025-09-07T07:46:33.8328740Z * [new branch] gh/wconstab/445/orig -> origin/gh/wconstab/445/orig 2025-09-07T07:46:33.8328956Z * [new branch] gh/wconstab/446/base -> origin/gh/wconstab/446/base 2025-09-07T07:46:33.8329187Z * [new branch] gh/wconstab/446/head -> origin/gh/wconstab/446/head 2025-09-07T07:46:33.8329409Z * [new branch] gh/wconstab/446/orig -> origin/gh/wconstab/446/orig 2025-09-07T07:46:33.8329641Z * [new branch] gh/wconstab/447/base -> origin/gh/wconstab/447/base 2025-09-07T07:46:33.8329856Z * [new branch] gh/wconstab/447/head -> origin/gh/wconstab/447/head 2025-09-07T07:46:33.8330092Z * [new branch] gh/wconstab/447/orig -> origin/gh/wconstab/447/orig 2025-09-07T07:46:33.8330414Z * [new branch] gh/weifengpy/27/base -> origin/gh/weifengpy/27/base 2025-09-07T07:46:33.8330638Z * [new branch] gh/weifengpy/27/head -> origin/gh/weifengpy/27/head 2025-09-07T07:46:33.8330876Z * [new branch] gh/weifengpy/27/orig -> origin/gh/weifengpy/27/orig 2025-09-07T07:46:33.8331099Z * [new branch] gh/weifengpy/30/base -> origin/gh/weifengpy/30/base 2025-09-07T07:46:33.8331333Z * [new branch] gh/weifengpy/30/head -> origin/gh/weifengpy/30/head 2025-09-07T07:46:33.8331555Z * [new branch] gh/weifengpy/30/orig -> origin/gh/weifengpy/30/orig 2025-09-07T07:46:33.8331813Z * [new branch] 
gh/williamwen42/196/base -> origin/gh/williamwen42/196/base 2025-09-07T07:46:33.8332055Z * [new branch] gh/williamwen42/196/head -> origin/gh/williamwen42/196/head 2025-09-07T07:46:33.8332300Z * [new branch] gh/williamwen42/196/orig -> origin/gh/williamwen42/196/orig 2025-09-07T07:46:33.8332556Z * [new branch] gh/williamwen42/250/base -> origin/gh/williamwen42/250/base 2025-09-07T07:46:33.8332798Z * [new branch] gh/williamwen42/250/head -> origin/gh/williamwen42/250/head 2025-09-07T07:46:33.8333053Z * [new branch] gh/williamwen42/250/orig -> origin/gh/williamwen42/250/orig 2025-09-07T07:46:33.8333292Z * [new branch] gh/williamwen42/258/base -> origin/gh/williamwen42/258/base 2025-09-07T07:46:33.8333532Z * [new branch] gh/williamwen42/258/head -> origin/gh/williamwen42/258/head 2025-09-07T07:46:33.8333783Z * [new branch] gh/williamwen42/258/orig -> origin/gh/williamwen42/258/orig 2025-09-07T07:46:33.8334024Z * [new branch] gh/williamwen42/266/base -> origin/gh/williamwen42/266/base 2025-09-07T07:46:33.8334365Z * [new branch] gh/williamwen42/266/head -> origin/gh/williamwen42/266/head 2025-09-07T07:46:33.8334606Z * [new branch] gh/williamwen42/266/orig -> origin/gh/williamwen42/266/orig 2025-09-07T07:46:33.8334862Z * [new branch] gh/williamwen42/267/base -> origin/gh/williamwen42/267/base 2025-09-07T07:46:33.8335102Z * [new branch] gh/williamwen42/267/head -> origin/gh/williamwen42/267/head 2025-09-07T07:46:33.8335342Z * [new branch] gh/williamwen42/267/orig -> origin/gh/williamwen42/267/orig 2025-09-07T07:46:33.8335594Z * [new branch] gh/williamwen42/270/base -> origin/gh/williamwen42/270/base 2025-09-07T07:46:33.8335832Z * [new branch] gh/williamwen42/270/head -> origin/gh/williamwen42/270/head 2025-09-07T07:46:33.8336082Z * [new branch] gh/williamwen42/270/orig -> origin/gh/williamwen42/270/orig 2025-09-07T07:46:33.8336321Z * [new branch] gh/williamwen42/271/base -> origin/gh/williamwen42/271/base 2025-09-07T07:46:33.8336576Z * [new branch] 
gh/williamwen42/271/head -> origin/gh/williamwen42/271/head 2025-09-07T07:46:33.8336818Z * [new branch] gh/williamwen42/271/orig -> origin/gh/williamwen42/271/orig 2025-09-07T07:46:33.8337056Z * [new branch] gh/williamwen42/272/base -> origin/gh/williamwen42/272/base 2025-09-07T07:46:33.8337378Z * [new branch] gh/williamwen42/272/head -> origin/gh/williamwen42/272/head 2025-09-07T07:46:33.8337628Z * [new branch] gh/williamwen42/272/orig -> origin/gh/williamwen42/272/orig 2025-09-07T07:46:33.8337883Z * [new branch] gh/williamwen42/274/base -> origin/gh/williamwen42/274/base 2025-09-07T07:46:33.8338123Z * [new branch] gh/williamwen42/274/head -> origin/gh/williamwen42/274/head 2025-09-07T07:46:33.8338364Z * [new branch] gh/williamwen42/274/orig -> origin/gh/williamwen42/274/orig 2025-09-07T07:46:33.8338623Z * [new branch] gh/williamwen42/275/base -> origin/gh/williamwen42/275/base 2025-09-07T07:46:33.8338962Z * [new branch] gh/williamwen42/275/head -> origin/gh/williamwen42/275/head 2025-09-07T07:46:33.8339222Z * [new branch] gh/williamwen42/276/base -> origin/gh/williamwen42/276/base 2025-09-07T07:46:33.8339463Z * [new branch] gh/williamwen42/276/head -> origin/gh/williamwen42/276/head 2025-09-07T07:46:33.8339719Z * [new branch] gh/williamwen42/276/orig -> origin/gh/williamwen42/276/orig 2025-09-07T07:46:33.8339960Z * [new branch] gh/williamwen42/277/base -> origin/gh/williamwen42/277/base 2025-09-07T07:46:33.8340199Z * [new branch] gh/williamwen42/277/head -> origin/gh/williamwen42/277/head 2025-09-07T07:46:33.8340455Z * [new branch] gh/williamwen42/277/orig -> origin/gh/williamwen42/277/orig 2025-09-07T07:46:33.8340702Z * [new branch] gh/williamwen42/278/base -> origin/gh/williamwen42/278/base 2025-09-07T07:46:33.8340955Z * [new branch] gh/williamwen42/278/head -> origin/gh/williamwen42/278/head 2025-09-07T07:46:33.8341198Z * [new branch] gh/williamwen42/278/orig -> origin/gh/williamwen42/278/orig 2025-09-07T07:46:33.8341451Z * [new branch] 
gh/williamwen42/279/base -> origin/gh/williamwen42/279/base 2025-09-07T07:46:33.8341693Z * [new branch] gh/williamwen42/279/head -> origin/gh/williamwen42/279/head 2025-09-07T07:46:33.8341933Z * [new branch] gh/williamwen42/279/orig -> origin/gh/williamwen42/279/orig 2025-09-07T07:46:33.8342187Z * [new branch] gh/williamwen42/280/base -> origin/gh/williamwen42/280/base 2025-09-07T07:46:33.8342425Z * [new branch] gh/williamwen42/280/head -> origin/gh/williamwen42/280/head 2025-09-07T07:46:33.8342679Z * [new branch] gh/williamwen42/280/orig -> origin/gh/williamwen42/280/orig 2025-09-07T07:46:33.8343089Z * [new branch] gh/williamwen42/281/base -> origin/gh/williamwen42/281/base 2025-09-07T07:46:33.8343332Z * [new branch] gh/williamwen42/281/head -> origin/gh/williamwen42/281/head 2025-09-07T07:46:33.8343587Z * [new branch] gh/williamwen42/281/orig -> origin/gh/williamwen42/281/orig 2025-09-07T07:46:33.8343827Z * [new branch] gh/williamwen42/282/base -> origin/gh/williamwen42/282/base 2025-09-07T07:46:33.8344080Z * [new branch] gh/williamwen42/282/head -> origin/gh/williamwen42/282/head 2025-09-07T07:46:33.8344316Z * [new branch] gh/williamwen42/282/orig -> origin/gh/williamwen42/282/orig 2025-09-07T07:46:33.8344568Z * [new branch] gh/williamwen42/283/base -> origin/gh/williamwen42/283/base 2025-09-07T07:46:33.8344808Z * [new branch] gh/williamwen42/283/head -> origin/gh/williamwen42/283/head 2025-09-07T07:46:33.8345050Z * [new branch] gh/williamwen42/283/orig -> origin/gh/williamwen42/283/orig 2025-09-07T07:46:33.8345306Z * [new branch] gh/williamwen42/284/base -> origin/gh/williamwen42/284/base 2025-09-07T07:46:33.8345549Z * [new branch] gh/williamwen42/284/head -> origin/gh/williamwen42/284/head 2025-09-07T07:46:33.8345803Z * [new branch] gh/williamwen42/284/orig -> origin/gh/williamwen42/284/orig 2025-09-07T07:46:33.8346040Z * [new branch] gh/williamwen42/285/base -> origin/gh/williamwen42/285/base 2025-09-07T07:46:33.8346291Z * [new branch] 
gh/williamwen42/285/head -> origin/gh/williamwen42/285/head 2025-09-07T07:46:33.8346531Z * [new branch] gh/williamwen42/285/orig -> origin/gh/williamwen42/285/orig 2025-09-07T07:46:33.8346771Z * [new branch] gh/williamwen42/286/base -> origin/gh/williamwen42/286/base 2025-09-07T07:46:33.8347020Z * [new branch] gh/williamwen42/286/head -> origin/gh/williamwen42/286/head 2025-09-07T07:46:33.8347261Z * [new branch] gh/williamwen42/286/orig -> origin/gh/williamwen42/286/orig 2025-09-07T07:46:33.8347619Z * [new branch] gh/williamwen42/287/base -> origin/gh/williamwen42/287/base 2025-09-07T07:46:33.8347860Z * [new branch] gh/williamwen42/287/head -> origin/gh/williamwen42/287/head 2025-09-07T07:46:33.8348097Z * [new branch] gh/williamwen42/287/orig -> origin/gh/williamwen42/287/orig 2025-09-07T07:46:33.8348347Z * [new branch] gh/williamwen42/288/base -> origin/gh/williamwen42/288/base 2025-09-07T07:46:33.8348587Z * [new branch] gh/williamwen42/288/head -> origin/gh/williamwen42/288/head 2025-09-07T07:46:33.8348839Z * [new branch] gh/williamwen42/288/orig -> origin/gh/williamwen42/288/orig 2025-09-07T07:46:33.8349077Z * [new branch] gh/williamwen42/289/base -> origin/gh/williamwen42/289/base 2025-09-07T07:46:33.8349332Z * [new branch] gh/williamwen42/289/head -> origin/gh/williamwen42/289/head 2025-09-07T07:46:33.8349574Z * [new branch] gh/williamwen42/289/orig -> origin/gh/williamwen42/289/orig 2025-09-07T07:46:33.8349775Z * [new branch] gh/wychi/1/base -> origin/gh/wychi/1/base 2025-09-07T07:46:33.8349986Z * [new branch] gh/wychi/1/head -> origin/gh/wychi/1/head 2025-09-07T07:46:33.8350182Z * [new branch] gh/wychi/1/orig -> origin/gh/wychi/1/orig 2025-09-07T07:46:33.8350401Z * [new branch] gh/xmfan/169/base -> origin/gh/xmfan/169/base 2025-09-07T07:46:33.8350605Z * [new branch] gh/xmfan/169/head -> origin/gh/xmfan/169/head 2025-09-07T07:46:33.8350821Z * [new branch] gh/xmfan/170/base -> origin/gh/xmfan/170/base 2025-09-07T07:46:33.8351024Z * [new branch] 
gh/xmfan/170/head -> origin/gh/xmfan/170/head 2025-09-07T07:46:33.8351309Z * [new branch] gh/xmfan/18/base -> origin/gh/xmfan/18/base 2025-09-07T07:46:33.8351520Z * [new branch] gh/xmfan/18/head -> origin/gh/xmfan/18/head 2025-09-07T07:46:33.8351725Z * [new branch] gh/xmfan/229/base -> origin/gh/xmfan/229/base 2025-09-07T07:46:33.8351940Z * [new branch] gh/xmfan/229/head -> origin/gh/xmfan/229/head 2025-09-07T07:46:33.8352144Z * [new branch] gh/xmfan/229/orig -> origin/gh/xmfan/229/orig 2025-09-07T07:46:33.8352345Z * [new branch] gh/xmfan/237/base -> origin/gh/xmfan/237/base 2025-09-07T07:46:33.8352558Z * [new branch] gh/xmfan/237/head -> origin/gh/xmfan/237/head 2025-09-07T07:46:33.8352761Z * [new branch] gh/xmfan/237/orig -> origin/gh/xmfan/237/orig 2025-09-07T07:46:33.8352976Z * [new branch] gh/xmfan/244/base -> origin/gh/xmfan/244/base 2025-09-07T07:46:33.8353180Z * [new branch] gh/xmfan/244/head -> origin/gh/xmfan/244/head 2025-09-07T07:46:33.8353395Z * [new branch] gh/xmfan/244/orig -> origin/gh/xmfan/244/orig 2025-09-07T07:46:33.8353599Z * [new branch] gh/xmfan/246/base -> origin/gh/xmfan/246/base 2025-09-07T07:46:33.8353803Z * [new branch] gh/xmfan/246/head -> origin/gh/xmfan/246/head 2025-09-07T07:46:33.8354020Z * [new branch] gh/xmfan/246/orig -> origin/gh/xmfan/246/orig 2025-09-07T07:46:33.8354224Z * [new branch] gh/xmfan/253/base -> origin/gh/xmfan/253/base 2025-09-07T07:46:33.8354436Z * [new branch] gh/xmfan/253/head -> origin/gh/xmfan/253/head 2025-09-07T07:46:33.8354636Z * [new branch] gh/xmfan/253/orig -> origin/gh/xmfan/253/orig 2025-09-07T07:46:33.8354841Z * [new branch] gh/xmfan/254/base -> origin/gh/xmfan/254/base 2025-09-07T07:46:33.8355057Z * [new branch] gh/xmfan/254/head -> origin/gh/xmfan/254/head 2025-09-07T07:46:33.8355259Z * [new branch] gh/xmfan/254/orig -> origin/gh/xmfan/254/orig 2025-09-07T07:46:33.8355563Z * [new branch] gh/xmfan/260/base -> origin/gh/xmfan/260/base 2025-09-07T07:46:33.8355770Z * [new branch] gh/xmfan/260/head -> 
origin/gh/xmfan/260/head 2025-09-07T07:46:33.8355985Z * [new branch] gh/xmfan/260/orig -> origin/gh/xmfan/260/orig 2025-09-07T07:46:33.8356188Z * [new branch] gh/xmfan/262/base -> origin/gh/xmfan/262/base 2025-09-07T07:46:33.8356391Z * [new branch] gh/xmfan/262/head -> origin/gh/xmfan/262/head 2025-09-07T07:46:33.8356605Z * [new branch] gh/xmfan/262/orig -> origin/gh/xmfan/262/orig 2025-09-07T07:46:33.8356809Z * [new branch] gh/xmfan/263/base -> origin/gh/xmfan/263/base 2025-09-07T07:46:33.8357028Z * [new branch] gh/xmfan/263/head -> origin/gh/xmfan/263/head 2025-09-07T07:46:33.8357230Z * [new branch] gh/xmfan/263/orig -> origin/gh/xmfan/263/orig 2025-09-07T07:46:33.8357434Z * [new branch] gh/xmfan/264/base -> origin/gh/xmfan/264/base 2025-09-07T07:46:33.8357652Z * [new branch] gh/xmfan/264/head -> origin/gh/xmfan/264/head 2025-09-07T07:46:33.8357854Z * [new branch] gh/xmfan/264/orig -> origin/gh/xmfan/264/orig 2025-09-07T07:46:33.8358069Z * [new branch] gh/xmfan/274/base -> origin/gh/xmfan/274/base 2025-09-07T07:46:33.8358273Z * [new branch] gh/xmfan/274/head -> origin/gh/xmfan/274/head 2025-09-07T07:46:33.8358488Z * [new branch] gh/xmfan/274/orig -> origin/gh/xmfan/274/orig 2025-09-07T07:46:33.8358690Z * [new branch] gh/xmfan/276/base -> origin/gh/xmfan/276/base 2025-09-07T07:46:33.8358984Z * [new branch] gh/xmfan/276/head -> origin/gh/xmfan/276/head 2025-09-07T07:46:33.8359200Z * [new branch] gh/xmfan/276/orig -> origin/gh/xmfan/276/orig 2025-09-07T07:46:33.8359405Z * [new branch] gh/xmfan/277/base -> origin/gh/xmfan/277/base 2025-09-07T07:46:33.8359621Z * [new branch] gh/xmfan/277/head -> origin/gh/xmfan/277/head 2025-09-07T07:46:33.8359823Z * [new branch] gh/xmfan/277/orig -> origin/gh/xmfan/277/orig 2025-09-07T07:46:33.8360029Z * [new branch] gh/xmfan/278/base -> origin/gh/xmfan/278/base 2025-09-07T07:46:33.8360246Z * [new branch] gh/xmfan/278/head -> origin/gh/xmfan/278/head 2025-09-07T07:46:33.8360453Z * [new branch] gh/xmfan/278/orig -> 
origin/gh/xmfan/278/orig 2025-09-07T07:46:33.8360669Z * [new branch] gh/xmfan/279/base -> origin/gh/xmfan/279/base 2025-09-07T07:46:33.8360876Z * [new branch] gh/xmfan/279/head -> origin/gh/xmfan/279/head 2025-09-07T07:46:33.8361092Z * [new branch] gh/xmfan/279/orig -> origin/gh/xmfan/279/orig 2025-09-07T07:46:33.8361296Z * [new branch] gh/xmfan/280/base -> origin/gh/xmfan/280/base 2025-09-07T07:46:33.8361500Z * [new branch] gh/xmfan/280/head -> origin/gh/xmfan/280/head 2025-09-07T07:46:33.8361717Z * [new branch] gh/xmfan/280/orig -> origin/gh/xmfan/280/orig 2025-09-07T07:46:33.8361917Z * [new branch] gh/xmfan/281/base -> origin/gh/xmfan/281/base 2025-09-07T07:46:33.8362131Z * [new branch] gh/xmfan/281/head -> origin/gh/xmfan/281/head 2025-09-07T07:46:33.8362333Z * [new branch] gh/xmfan/281/orig -> origin/gh/xmfan/281/orig 2025-09-07T07:46:33.8362536Z * [new branch] gh/xmfan/282/base -> origin/gh/xmfan/282/base 2025-09-07T07:46:33.8362756Z * [new branch] gh/xmfan/282/head -> origin/gh/xmfan/282/head 2025-09-07T07:46:33.8363083Z * [new branch] gh/xmfan/283/base -> origin/gh/xmfan/283/base 2025-09-07T07:46:33.8363404Z * [new branch] gh/xmfan/283/head -> origin/gh/xmfan/283/head 2025-09-07T07:46:33.8363607Z * [new branch] gh/xmfan/283/orig -> origin/gh/xmfan/283/orig 2025-09-07T07:46:33.8363860Z * [new branch] gh/xuanzhang816/14/base -> origin/gh/xuanzhang816/14/base 2025-09-07T07:46:33.8364097Z * [new branch] gh/xuanzhang816/14/head -> origin/gh/xuanzhang816/14/head 2025-09-07T07:46:33.8364334Z * [new branch] gh/xuanzhang816/14/orig -> origin/gh/xuanzhang816/14/orig 2025-09-07T07:46:33.8364583Z * [new branch] gh/xuanzhang816/19/base -> origin/gh/xuanzhang816/19/base 2025-09-07T07:46:33.8364817Z * [new branch] gh/xuanzhang816/19/head -> origin/gh/xuanzhang816/19/head 2025-09-07T07:46:33.8365071Z * [new branch] gh/xuanzhang816/19/orig -> origin/gh/xuanzhang816/19/orig 2025-09-07T07:46:33.8365306Z * [new branch] gh/xuanzhang816/22/base -> origin/gh/xuanzhang816/22/base 
2025-09-07T07:46:33.8365558Z * [new branch] gh/xuanzhang816/22/head -> origin/gh/xuanzhang816/22/head 2025-09-07T07:46:33.8365791Z * [new branch] gh/xuanzhang816/22/orig -> origin/gh/xuanzhang816/22/orig 2025-09-07T07:46:33.8366027Z * [new branch] gh/xuanzhang816/23/base -> origin/gh/xuanzhang816/23/base 2025-09-07T07:46:33.8366272Z * [new branch] gh/xuanzhang816/23/head -> origin/gh/xuanzhang816/23/head 2025-09-07T07:46:33.8366503Z * [new branch] gh/xuanzhang816/23/orig -> origin/gh/xuanzhang816/23/orig 2025-09-07T07:46:33.8366747Z * [new branch] gh/xuanzhang816/24/base -> origin/gh/xuanzhang816/24/base 2025-09-07T07:46:33.8367076Z * [new branch] gh/xuanzhang816/24/head -> origin/gh/xuanzhang816/24/head 2025-09-07T07:46:33.8367309Z * [new branch] gh/xuanzhang816/24/orig -> origin/gh/xuanzhang816/24/orig 2025-09-07T07:46:33.8367557Z * [new branch] gh/xuanzhang816/25/base -> origin/gh/xuanzhang816/25/base 2025-09-07T07:46:33.8367791Z * [new branch] gh/xuanzhang816/25/head -> origin/gh/xuanzhang816/25/head 2025-09-07T07:46:33.8368034Z * [new branch] gh/xuanzhang816/25/orig -> origin/gh/xuanzhang816/25/orig 2025-09-07T07:46:33.8368268Z * [new branch] gh/xuanzhang816/26/base -> origin/gh/xuanzhang816/26/base 2025-09-07T07:46:33.8368516Z * [new branch] gh/xuanzhang816/26/head -> origin/gh/xuanzhang816/26/head 2025-09-07T07:46:33.8368749Z * [new branch] gh/xuanzhang816/26/orig -> origin/gh/xuanzhang816/26/orig 2025-09-07T07:46:33.8368972Z * [new branch] gh/yanbing-j/11/base -> origin/gh/yanbing-j/11/base 2025-09-07T07:46:33.8369208Z * [new branch] gh/yanbing-j/11/head -> origin/gh/yanbing-j/11/head 2025-09-07T07:46:33.8369425Z * [new branch] gh/yanbing-j/11/orig -> origin/gh/yanbing-j/11/orig 2025-09-07T07:46:33.8369657Z * [new branch] gh/yanbing-j/12/base -> origin/gh/yanbing-j/12/base 2025-09-07T07:46:33.8369872Z * [new branch] gh/yanbing-j/12/head -> origin/gh/yanbing-j/12/head 2025-09-07T07:46:33.8370101Z * [new branch] gh/yanbing-j/12/orig -> 
origin/gh/yanbing-j/12/orig 2025-09-07T07:46:33.8370318Z * [new branch] gh/yanbing-j/13/base -> origin/gh/yanbing-j/13/base 2025-09-07T07:46:33.8370531Z * [new branch] gh/yanbing-j/13/head -> origin/gh/yanbing-j/13/head 2025-09-07T07:46:33.8370755Z * [new branch] gh/yanbing-j/13/orig -> origin/gh/yanbing-j/13/orig 2025-09-07T07:46:33.8370968Z * [new branch] gh/yanbing-j/14/base -> origin/gh/yanbing-j/14/base 2025-09-07T07:46:33.8371200Z * [new branch] gh/yanbing-j/14/head -> origin/gh/yanbing-j/14/head 2025-09-07T07:46:33.8371508Z * [new branch] gh/yanbing-j/14/orig -> origin/gh/yanbing-j/14/orig 2025-09-07T07:46:33.8371725Z * [new branch] gh/yanbing-j/15/base -> origin/gh/yanbing-j/15/base 2025-09-07T07:46:33.8371950Z * [new branch] gh/yanbing-j/15/head -> origin/gh/yanbing-j/15/head 2025-09-07T07:46:33.8372164Z * [new branch] gh/yanbing-j/15/orig -> origin/gh/yanbing-j/15/orig 2025-09-07T07:46:33.8372393Z * [new branch] gh/yanbing-j/18/base -> origin/gh/yanbing-j/18/base 2025-09-07T07:46:33.8372606Z * [new branch] gh/yanbing-j/18/head -> origin/gh/yanbing-j/18/head 2025-09-07T07:46:33.8372836Z * [new branch] gh/yanbing-j/18/orig -> origin/gh/yanbing-j/18/orig 2025-09-07T07:46:33.8373054Z * [new branch] gh/yanbing-j/19/base -> origin/gh/yanbing-j/19/base 2025-09-07T07:46:33.8373267Z * [new branch] gh/yanbing-j/19/head -> origin/gh/yanbing-j/19/head 2025-09-07T07:46:33.8373500Z * [new branch] gh/yanbing-j/19/orig -> origin/gh/yanbing-j/19/orig 2025-09-07T07:46:33.8373716Z * [new branch] gh/yanbing-j/20/base -> origin/gh/yanbing-j/20/base 2025-09-07T07:46:33.8373946Z * [new branch] gh/yanbing-j/20/head -> origin/gh/yanbing-j/20/head 2025-09-07T07:46:33.8374157Z * [new branch] gh/yanbing-j/20/orig -> origin/gh/yanbing-j/20/orig 2025-09-07T07:46:33.8374375Z * [new branch] gh/yanbing-j/21/base -> origin/gh/yanbing-j/21/base 2025-09-07T07:46:33.8374601Z * [new branch] gh/yanbing-j/21/head -> origin/gh/yanbing-j/21/head 2025-09-07T07:46:33.8374816Z * [new branch] 
gh/yanbing-j/22/base -> origin/gh/yanbing-j/22/base 2025-09-07T07:46:33.8375148Z * [new branch] gh/yanbing-j/22/head -> origin/gh/yanbing-j/22/head 2025-09-07T07:46:33.8375361Z * [new branch] gh/yanbing-j/22/orig -> origin/gh/yanbing-j/22/orig 2025-09-07T07:46:33.8375575Z * [new branch] gh/yanbing-j/23/base -> origin/gh/yanbing-j/23/base 2025-09-07T07:46:33.8375800Z * [new branch] gh/yanbing-j/23/head -> origin/gh/yanbing-j/23/head 2025-09-07T07:46:33.8376011Z * [new branch] gh/yanbing-j/23/orig -> origin/gh/yanbing-j/23/orig 2025-09-07T07:46:33.8376234Z * [new branch] gh/yanbing-j/24/base -> origin/gh/yanbing-j/24/base 2025-09-07T07:46:33.8376453Z * [new branch] gh/yanbing-j/24/head -> origin/gh/yanbing-j/24/head 2025-09-07T07:46:33.8376677Z * [new branch] gh/yanbing-j/24/orig -> origin/gh/yanbing-j/24/orig 2025-09-07T07:46:33.8376890Z * [new branch] gh/yanbing-j/25/base -> origin/gh/yanbing-j/25/base 2025-09-07T07:46:33.8377104Z * [new branch] gh/yanbing-j/25/head -> origin/gh/yanbing-j/25/head 2025-09-07T07:46:33.8377412Z * [new branch] gh/yanbing-j/25/orig -> origin/gh/yanbing-j/25/orig 2025-09-07T07:46:33.8377630Z * [new branch] gh/yanbing-j/26/base -> origin/gh/yanbing-j/26/base 2025-09-07T07:46:33.8377855Z * [new branch] gh/yanbing-j/26/head -> origin/gh/yanbing-j/26/head 2025-09-07T07:46:33.8378068Z * [new branch] gh/yanbing-j/26/orig -> origin/gh/yanbing-j/26/orig 2025-09-07T07:46:33.8378290Z * [new branch] gh/yanbing-j/36/base -> origin/gh/yanbing-j/36/base 2025-09-07T07:46:33.8378502Z * [new branch] gh/yanbing-j/36/head -> origin/gh/yanbing-j/36/head 2025-09-07T07:46:33.8378716Z * [new branch] gh/yanbing-j/36/orig -> origin/gh/yanbing-j/36/orig 2025-09-07T07:46:33.8378936Z * [new branch] gh/yanbing-j/37/base -> origin/gh/yanbing-j/37/base 2025-09-07T07:46:33.8379153Z * [new branch] gh/yanbing-j/37/head -> origin/gh/yanbing-j/37/head 2025-09-07T07:46:33.8379456Z * [new branch] gh/yanbing-j/37/orig -> origin/gh/yanbing-j/37/orig 
2025-09-07T07:46:33.8379671Z * [new branch] gh/yangw-dev/12/base -> origin/gh/yangw-dev/12/base 2025-09-07T07:46:33.8379884Z * [new branch] gh/yangw-dev/12/head -> origin/gh/yangw-dev/12/head 2025-09-07T07:46:33.8380102Z * [new branch] gh/yangw-dev/12/orig -> origin/gh/yangw-dev/12/orig 2025-09-07T07:46:33.8380312Z * [new branch] gh/yangw-dev/13/base -> origin/gh/yangw-dev/13/base 2025-09-07T07:46:33.8380532Z * [new branch] gh/yangw-dev/13/head -> origin/gh/yangw-dev/13/head 2025-09-07T07:46:33.8380746Z * [new branch] gh/yangw-dev/13/orig -> origin/gh/yangw-dev/13/orig 2025-09-07T07:46:33.8380970Z * [new branch] gh/yangw-dev/14/base -> origin/gh/yangw-dev/14/base 2025-09-07T07:46:33.8381182Z * [new branch] gh/yangw-dev/14/head -> origin/gh/yangw-dev/14/head 2025-09-07T07:46:33.8381397Z * [new branch] gh/yangw-dev/14/orig -> origin/gh/yangw-dev/14/orig 2025-09-07T07:46:33.8381617Z * [new branch] gh/yangw-dev/15/base -> origin/gh/yangw-dev/15/base 2025-09-07T07:46:33.8381829Z * [new branch] gh/yangw-dev/15/head -> origin/gh/yangw-dev/15/head 2025-09-07T07:46:33.8382047Z * [new branch] gh/yangw-dev/15/orig -> origin/gh/yangw-dev/15/orig 2025-09-07T07:46:33.8382258Z * [new branch] gh/yangw-dev/16/base -> origin/gh/yangw-dev/16/base 2025-09-07T07:46:33.8382467Z * [new branch] gh/yangw-dev/16/head -> origin/gh/yangw-dev/16/head 2025-09-07T07:46:33.8382689Z * [new branch] gh/yangw-dev/16/orig -> origin/gh/yangw-dev/16/orig 2025-09-07T07:46:33.8382995Z * [new branch] gh/yangw-dev/17/base -> origin/gh/yangw-dev/17/base 2025-09-07T07:46:33.8383217Z * [new branch] gh/yangw-dev/17/head -> origin/gh/yangw-dev/17/head 2025-09-07T07:46:33.8383432Z * [new branch] gh/yangw-dev/17/orig -> origin/gh/yangw-dev/17/orig 2025-09-07T07:46:33.8383654Z * [new branch] gh/yangw-dev/18/base -> origin/gh/yangw-dev/18/base 2025-09-07T07:46:33.8383862Z * [new branch] gh/yangw-dev/18/head -> origin/gh/yangw-dev/18/head 2025-09-07T07:46:33.8384073Z * [new branch] gh/yangw-dev/18/orig -> 
origin/gh/yangw-dev/18/orig 2025-09-07T07:46:33.8384294Z * [new branch] gh/yangw-dev/19/base -> origin/gh/yangw-dev/19/base 2025-09-07T07:46:33.8384503Z * [new branch] gh/yangw-dev/19/head -> origin/gh/yangw-dev/19/head 2025-09-07T07:46:33.8384723Z * [new branch] gh/yangw-dev/19/orig -> origin/gh/yangw-dev/19/orig 2025-09-07T07:46:33.8384936Z * [new branch] gh/yangw-dev/20/base -> origin/gh/yangw-dev/20/base 2025-09-07T07:46:33.8385163Z * [new branch] gh/yangw-dev/20/head -> origin/gh/yangw-dev/20/head 2025-09-07T07:46:33.8385370Z * [new branch] gh/yangw-dev/20/orig -> origin/gh/yangw-dev/20/orig 2025-09-07T07:46:33.8385581Z * [new branch] gh/yangw-dev/21/base -> origin/gh/yangw-dev/21/base 2025-09-07T07:46:33.8385801Z * [new branch] gh/yangw-dev/21/head -> origin/gh/yangw-dev/21/head 2025-09-07T07:46:33.8386010Z * [new branch] gh/yangw-dev/21/orig -> origin/gh/yangw-dev/21/orig 2025-09-07T07:46:33.8386231Z * [new branch] gh/yangw-dev/22/base -> origin/gh/yangw-dev/22/base 2025-09-07T07:46:33.8386439Z * [new branch] gh/yangw-dev/22/head -> origin/gh/yangw-dev/22/head 2025-09-07T07:46:33.8386648Z * [new branch] gh/yangw-dev/22/orig -> origin/gh/yangw-dev/22/orig 2025-09-07T07:46:33.8386868Z * [new branch] gh/yangw-dev/23/base -> origin/gh/yangw-dev/23/base 2025-09-07T07:46:33.8387174Z * [new branch] gh/yangw-dev/23/head -> origin/gh/yangw-dev/23/head 2025-09-07T07:46:33.8387399Z * [new branch] gh/yangw-dev/23/orig -> origin/gh/yangw-dev/23/orig 2025-09-07T07:46:33.8387605Z * [new branch] gh/yangw-dev/24/base -> origin/gh/yangw-dev/24/base 2025-09-07T07:46:33.8387826Z * [new branch] gh/yangw-dev/24/head -> origin/gh/yangw-dev/24/head 2025-09-07T07:46:33.8388037Z * [new branch] gh/yangw-dev/24/orig -> origin/gh/yangw-dev/24/orig 2025-09-07T07:46:33.8388246Z * [new branch] gh/yangw-dev/25/base -> origin/gh/yangw-dev/25/base 2025-09-07T07:46:33.8388468Z * [new branch] gh/yangw-dev/25/head -> origin/gh/yangw-dev/25/head 2025-09-07T07:46:33.8388677Z * [new branch] 
gh/yangw-dev/25/orig -> origin/gh/yangw-dev/25/orig 2025-09-07T07:46:33.8388896Z * [new branch] gh/yangw-dev/26/base -> origin/gh/yangw-dev/26/base 2025-09-07T07:46:33.8389111Z * [new branch] gh/yangw-dev/26/head -> origin/gh/yangw-dev/26/head 2025-09-07T07:46:33.8389321Z * [new branch] gh/yangw-dev/26/orig -> origin/gh/yangw-dev/26/orig 2025-09-07T07:46:33.8389542Z * [new branch] gh/yangw-dev/27/base -> origin/gh/yangw-dev/27/base 2025-09-07T07:46:33.8389752Z * [new branch] gh/yangw-dev/27/head -> origin/gh/yangw-dev/27/head 2025-09-07T07:46:33.8389977Z * [new branch] gh/yangw-dev/27/orig -> origin/gh/yangw-dev/27/orig 2025-09-07T07:46:33.8390180Z * [new branch] gh/ydwu4/233/base -> origin/gh/ydwu4/233/base 2025-09-07T07:46:33.8390391Z * [new branch] gh/ydwu4/233/head -> origin/gh/ydwu4/233/head 2025-09-07T07:46:33.8390679Z * [new branch] gh/ydwu4/233/orig -> origin/gh/ydwu4/233/orig 2025-09-07T07:46:33.8390873Z * [new branch] gh/ydwu4/246/base -> origin/gh/ydwu4/246/base 2025-09-07T07:46:33.8391084Z * [new branch] gh/ydwu4/246/head -> origin/gh/ydwu4/246/head 2025-09-07T07:46:33.8391284Z * [new branch] gh/ydwu4/246/orig -> origin/gh/ydwu4/246/orig 2025-09-07T07:46:33.8391486Z * [new branch] gh/ydwu4/253/base -> origin/gh/ydwu4/253/base 2025-09-07T07:46:33.8391684Z * [new branch] gh/ydwu4/253/head -> origin/gh/ydwu4/253/head 2025-09-07T07:46:33.8391882Z * [new branch] gh/ydwu4/253/orig -> origin/gh/ydwu4/253/orig 2025-09-07T07:46:33.8392087Z * [new branch] gh/ydwu4/255/base -> origin/gh/ydwu4/255/base 2025-09-07T07:46:33.8392285Z * [new branch] gh/ydwu4/255/head -> origin/gh/ydwu4/255/head 2025-09-07T07:46:33.8392491Z * [new branch] gh/ydwu4/255/orig -> origin/gh/ydwu4/255/orig 2025-09-07T07:46:33.8392688Z * [new branch] gh/ydwu4/259/base -> origin/gh/ydwu4/259/base 2025-09-07T07:46:33.8392893Z * [new branch] gh/ydwu4/259/head -> origin/gh/ydwu4/259/head 2025-09-07T07:46:33.8393093Z * [new branch] gh/ydwu4/259/orig -> origin/gh/ydwu4/259/orig 
2025-09-07T07:46:33.8393291Z * [new branch] gh/ydwu4/262/base -> origin/gh/ydwu4/262/base 2025-09-07T07:46:33.8393498Z * [new branch] gh/ydwu4/262/head -> origin/gh/ydwu4/262/head 2025-09-07T07:46:33.8393697Z * [new branch] gh/ydwu4/262/orig -> origin/gh/ydwu4/262/orig 2025-09-07T07:46:33.8393900Z * [new branch] gh/ydwu4/263/base -> origin/gh/ydwu4/263/base 2025-09-07T07:46:33.8394097Z * [new branch] gh/ydwu4/263/head -> origin/gh/ydwu4/263/head 2025-09-07T07:46:33.8394306Z * [new branch] gh/ydwu4/263/orig -> origin/gh/ydwu4/263/orig 2025-09-07T07:46:33.8394504Z * [new branch] gh/ydwu4/269/base -> origin/gh/ydwu4/269/base 2025-09-07T07:46:33.8394790Z * [new branch] gh/ydwu4/269/head -> origin/gh/ydwu4/269/head 2025-09-07T07:46:33.8394997Z * [new branch] gh/ydwu4/269/orig -> origin/gh/ydwu4/269/orig 2025-09-07T07:46:33.8395195Z * [new branch] gh/ydwu4/270/base -> origin/gh/ydwu4/270/base 2025-09-07T07:46:33.8395401Z * [new branch] gh/ydwu4/270/head -> origin/gh/ydwu4/270/head 2025-09-07T07:46:33.8395599Z * [new branch] gh/ydwu4/270/orig -> origin/gh/ydwu4/270/orig 2025-09-07T07:46:33.8395800Z * [new branch] gh/ydwu4/272/base -> origin/gh/ydwu4/272/base 2025-09-07T07:46:33.8396006Z * [new branch] gh/ydwu4/272/head -> origin/gh/ydwu4/272/head 2025-09-07T07:46:33.8396209Z * [new branch] gh/ydwu4/272/orig -> origin/gh/ydwu4/272/orig 2025-09-07T07:46:33.8396419Z * [new branch] gh/ydwu4/275/base -> origin/gh/ydwu4/275/base 2025-09-07T07:46:33.8396620Z * [new branch] gh/ydwu4/275/head -> origin/gh/ydwu4/275/head 2025-09-07T07:46:33.8396837Z * [new branch] gh/ydwu4/275/orig -> origin/gh/ydwu4/275/orig 2025-09-07T07:46:33.8397034Z * [new branch] gh/ydwu4/276/base -> origin/gh/ydwu4/276/base 2025-09-07T07:46:33.8397231Z * [new branch] gh/ydwu4/276/head -> origin/gh/ydwu4/276/head 2025-09-07T07:46:33.8397440Z * [new branch] gh/ydwu4/276/orig -> origin/gh/ydwu4/276/orig 2025-09-07T07:46:33.8397638Z * [new branch] gh/ydwu4/279/base -> origin/gh/ydwu4/279/base 
2025-09-07T07:46:33.8397847Z * [new branch] gh/ydwu4/279/head -> origin/gh/ydwu4/279/head 2025-09-07T07:46:33.8398154Z * [new branch] gh/ydwu4/279/orig -> origin/gh/ydwu4/279/orig 2025-09-07T07:46:33.8398357Z * [new branch] gh/ydwu4/283/base -> origin/gh/ydwu4/283/base 2025-09-07T07:46:33.8398569Z * [new branch] gh/ydwu4/283/head -> origin/gh/ydwu4/283/head 2025-09-07T07:46:33.8398770Z * [new branch] gh/ydwu4/283/orig -> origin/gh/ydwu4/283/orig 2025-09-07T07:46:33.8398983Z * [new branch] gh/ydwu4/289/base -> origin/gh/ydwu4/289/base 2025-09-07T07:46:33.8399181Z * [new branch] gh/ydwu4/289/head -> origin/gh/ydwu4/289/head 2025-09-07T07:46:33.8399395Z * [new branch] gh/ydwu4/289/orig -> origin/gh/ydwu4/289/orig 2025-09-07T07:46:33.8399594Z * [new branch] gh/ydwu4/290/base -> origin/gh/ydwu4/290/base 2025-09-07T07:46:33.8399792Z * [new branch] gh/ydwu4/290/head -> origin/gh/ydwu4/290/head 2025-09-07T07:46:33.8400007Z * [new branch] gh/ydwu4/290/orig -> origin/gh/ydwu4/290/orig 2025-09-07T07:46:33.8400207Z * [new branch] gh/ydwu4/291/base -> origin/gh/ydwu4/291/base 2025-09-07T07:46:33.8400423Z * [new branch] gh/ydwu4/291/head -> origin/gh/ydwu4/291/head 2025-09-07T07:46:33.8400616Z * [new branch] gh/ydwu4/291/orig -> origin/gh/ydwu4/291/orig 2025-09-07T07:46:33.8400812Z * [new branch] gh/ydwu4/292/base -> origin/gh/ydwu4/292/base 2025-09-07T07:46:33.8401027Z * [new branch] gh/ydwu4/292/head -> origin/gh/ydwu4/292/head 2025-09-07T07:46:33.8401224Z * [new branch] gh/ydwu4/292/orig -> origin/gh/ydwu4/292/orig 2025-09-07T07:46:33.8401434Z * [new branch] gh/ydwu4/293/base -> origin/gh/ydwu4/293/base 2025-09-07T07:46:33.8401629Z * [new branch] gh/ydwu4/293/head -> origin/gh/ydwu4/293/head 2025-09-07T07:46:33.8401842Z * [new branch] gh/ydwu4/293/orig -> origin/gh/ydwu4/293/orig 2025-09-07T07:46:33.8402042Z * [new branch] gh/ydwu4/294/base -> origin/gh/ydwu4/294/base 2025-09-07T07:46:33.8402323Z * [new branch] gh/ydwu4/294/head -> origin/gh/ydwu4/294/head 
2025-09-07T07:46:33.8402536Z * [new branch] gh/ydwu4/294/orig -> origin/gh/ydwu4/294/orig 2025-09-07T07:46:33.8402736Z * [new branch] gh/ydwu4/295/base -> origin/gh/ydwu4/295/base 2025-09-07T07:46:33.8403071Z * [new branch] gh/ydwu4/295/head -> origin/gh/ydwu4/295/head 2025-09-07T07:46:33.8403274Z * [new branch] gh/ydwu4/295/orig -> origin/gh/ydwu4/295/orig 2025-09-07T07:46:33.8403473Z * [new branch] gh/ydwu4/296/base -> origin/gh/ydwu4/296/base 2025-09-07T07:46:33.8403685Z * [new branch] gh/ydwu4/296/head -> origin/gh/ydwu4/296/head 2025-09-07T07:46:33.8403888Z * [new branch] gh/ydwu4/296/orig -> origin/gh/ydwu4/296/orig 2025-09-07T07:46:33.8404094Z * [new branch] gh/ydwu4/300/base -> origin/gh/ydwu4/300/base 2025-09-07T07:46:33.8404296Z * [new branch] gh/ydwu4/300/head -> origin/gh/ydwu4/300/head 2025-09-07T07:46:33.8404503Z * [new branch] gh/ydwu4/300/orig -> origin/gh/ydwu4/300/orig 2025-09-07T07:46:33.8404702Z * [new branch] gh/ydwu4/301/base -> origin/gh/ydwu4/301/base 2025-09-07T07:46:33.8404897Z * [new branch] gh/ydwu4/301/head -> origin/gh/ydwu4/301/head 2025-09-07T07:46:33.8405108Z * [new branch] gh/ydwu4/301/orig -> origin/gh/ydwu4/301/orig 2025-09-07T07:46:33.8405310Z * [new branch] gh/ydwu4/302/base -> origin/gh/ydwu4/302/base 2025-09-07T07:46:33.8405516Z * [new branch] gh/ydwu4/302/head -> origin/gh/ydwu4/302/head 2025-09-07T07:46:33.8405836Z * [new branch] gh/ydwu4/302/orig -> origin/gh/ydwu4/302/orig 2025-09-07T07:46:33.8406035Z * [new branch] gh/ydwu4/303/base -> origin/gh/ydwu4/303/base 2025-09-07T07:46:33.8406247Z * [new branch] gh/ydwu4/303/head -> origin/gh/ydwu4/303/head 2025-09-07T07:46:33.8406446Z * [new branch] gh/ydwu4/303/orig -> origin/gh/ydwu4/303/orig 2025-09-07T07:46:33.8406656Z * [new branch] gh/ydwu4/304/base -> origin/gh/ydwu4/304/base 2025-09-07T07:46:33.8406858Z * [new branch] gh/ydwu4/304/head -> origin/gh/ydwu4/304/head 2025-09-07T07:46:33.8407063Z * [new branch] gh/ydwu4/304/orig -> origin/gh/ydwu4/304/orig 
2025-09-07T07:46:33.8407263Z * [new branch] gh/ydwu4/305/base -> origin/gh/ydwu4/305/base 2025-09-07T07:46:33.8407464Z * [new branch] gh/ydwu4/305/head -> origin/gh/ydwu4/305/head 2025-09-07T07:46:33.8407676Z * [new branch] gh/ydwu4/305/orig -> origin/gh/ydwu4/305/orig 2025-09-07T07:46:33.8407877Z * [new branch] gh/ydwu4/306/base -> origin/gh/ydwu4/306/base 2025-09-07T07:46:33.8408085Z * [new branch] gh/ydwu4/306/head -> origin/gh/ydwu4/306/head 2025-09-07T07:46:33.8408288Z * [new branch] gh/ydwu4/306/orig -> origin/gh/ydwu4/306/orig 2025-09-07T07:46:33.8408487Z * [new branch] gh/ydwu4/307/base -> origin/gh/ydwu4/307/base 2025-09-07T07:46:33.8408692Z * [new branch] gh/ydwu4/307/head -> origin/gh/ydwu4/307/head 2025-09-07T07:46:33.8408894Z * [new branch] gh/ydwu4/307/orig -> origin/gh/ydwu4/307/orig 2025-09-07T07:46:33.8409098Z * [new branch] gh/ydwu4/308/base -> origin/gh/ydwu4/308/base 2025-09-07T07:46:33.8409297Z * [new branch] gh/ydwu4/308/head -> origin/gh/ydwu4/308/head 2025-09-07T07:46:33.8409506Z * [new branch] gh/ydwu4/308/orig -> origin/gh/ydwu4/308/orig 2025-09-07T07:46:33.8409703Z * [new branch] gh/ydwu4/309/base -> origin/gh/ydwu4/309/base 2025-09-07T07:46:33.8410019Z * [new branch] gh/ydwu4/309/head -> origin/gh/ydwu4/309/head 2025-09-07T07:46:33.8410227Z * [new branch] gh/ydwu4/309/orig -> origin/gh/ydwu4/309/orig 2025-09-07T07:46:33.8410427Z * [new branch] gh/ydwu4/310/base -> origin/gh/ydwu4/310/base 2025-09-07T07:46:33.8410634Z * [new branch] gh/ydwu4/310/head -> origin/gh/ydwu4/310/head 2025-09-07T07:46:33.8410834Z * [new branch] gh/ydwu4/310/orig -> origin/gh/ydwu4/310/orig 2025-09-07T07:46:33.8411036Z * [new branch] gh/ydwu4/311/base -> origin/gh/ydwu4/311/base 2025-09-07T07:46:33.8411242Z * [new branch] gh/ydwu4/311/head -> origin/gh/ydwu4/311/head 2025-09-07T07:46:33.8411447Z * [new branch] gh/ydwu4/311/orig -> origin/gh/ydwu4/311/orig 2025-09-07T07:46:33.8411655Z * [new branch] gh/ydwu4/312/base -> origin/gh/ydwu4/312/base 
2025-09-07T07:46:33.8411855Z * [new branch] gh/ydwu4/312/head -> origin/gh/ydwu4/312/head 2025-09-07T07:46:33.8412060Z * [new branch] gh/ydwu4/312/orig -> origin/gh/ydwu4/312/orig 2025-09-07T07:46:33.8412257Z * [new branch] gh/ydwu4/313/base -> origin/gh/ydwu4/313/base 2025-09-07T07:46:33.8412455Z * [new branch] gh/ydwu4/313/head -> origin/gh/ydwu4/313/head 2025-09-07T07:46:33.8412659Z * [new branch] gh/ydwu4/313/orig -> origin/gh/ydwu4/313/orig 2025-09-07T07:46:33.8412858Z * [new branch] gh/ydwu4/314/base -> origin/gh/ydwu4/314/base 2025-09-07T07:46:33.8413069Z * [new branch] gh/ydwu4/314/head -> origin/gh/ydwu4/314/head 2025-09-07T07:46:33.8413361Z * [new branch] gh/ydwu4/314/orig -> origin/gh/ydwu4/314/orig 2025-09-07T07:46:33.8413573Z * [new branch] gh/ydwu4/315/base -> origin/gh/ydwu4/315/base 2025-09-07T07:46:33.8413771Z * [new branch] gh/ydwu4/315/head -> origin/gh/ydwu4/315/head 2025-09-07T07:46:33.8413970Z * [new branch] gh/ydwu4/315/orig -> origin/gh/ydwu4/315/orig 2025-09-07T07:46:33.8414179Z * [new branch] gh/ydwu4/316/base -> origin/gh/ydwu4/316/base 2025-09-07T07:46:33.8414377Z * [new branch] gh/ydwu4/316/head -> origin/gh/ydwu4/316/head 2025-09-07T07:46:33.8414588Z * [new branch] gh/ydwu4/316/orig -> origin/gh/ydwu4/316/orig 2025-09-07T07:46:33.8414791Z * [new branch] gh/ydwu4/317/base -> origin/gh/ydwu4/317/base 2025-09-07T07:46:33.8414989Z * [new branch] gh/ydwu4/317/head -> origin/gh/ydwu4/317/head 2025-09-07T07:46:33.8415203Z * [new branch] gh/ydwu4/317/orig -> origin/gh/ydwu4/317/orig 2025-09-07T07:46:33.8415409Z * [new branch] gh/ydwu4/318/base -> origin/gh/ydwu4/318/base 2025-09-07T07:46:33.8415620Z * [new branch] gh/ydwu4/318/head -> origin/gh/ydwu4/318/head 2025-09-07T07:46:33.8415818Z * [new branch] gh/ydwu4/318/orig -> origin/gh/ydwu4/318/orig 2025-09-07T07:46:33.8416029Z * [new branch] gh/ydwu4/319/base -> origin/gh/ydwu4/319/base 2025-09-07T07:46:33.8416227Z * [new branch] gh/ydwu4/319/head -> origin/gh/ydwu4/319/head 
2025-09-07T07:46:33.8416425Z * [new branch] gh/ydwu4/319/orig -> origin/gh/ydwu4/319/orig 2025-09-07T07:46:33.8416629Z * [new branch] gh/ydwu4/320/base -> origin/gh/ydwu4/320/base 2025-09-07T07:46:33.8416826Z * [new branch] gh/ydwu4/320/head -> origin/gh/ydwu4/320/head 2025-09-07T07:46:33.8417037Z * [new branch] gh/ydwu4/320/orig -> origin/gh/ydwu4/320/orig 2025-09-07T07:46:33.8417402Z * [new branch] gh/ydwu4/321/base -> origin/gh/ydwu4/321/base 2025-09-07T07:46:33.8417608Z * [new branch] gh/ydwu4/321/head -> origin/gh/ydwu4/321/head 2025-09-07T07:46:33.8417819Z * [new branch] gh/ydwu4/321/orig -> origin/gh/ydwu4/321/orig 2025-09-07T07:46:33.8418017Z * [new branch] gh/ydwu4/322/base -> origin/gh/ydwu4/322/base 2025-09-07T07:46:33.8418230Z * [new branch] gh/ydwu4/322/head -> origin/gh/ydwu4/322/head 2025-09-07T07:46:33.8418430Z * [new branch] gh/ydwu4/322/orig -> origin/gh/ydwu4/322/orig 2025-09-07T07:46:33.8418639Z * [new branch] gh/ydwu4/323/base -> origin/gh/ydwu4/323/base 2025-09-07T07:46:33.8418840Z * [new branch] gh/ydwu4/323/head -> origin/gh/ydwu4/323/head 2025-09-07T07:46:33.8419042Z * [new branch] gh/ydwu4/323/orig -> origin/gh/ydwu4/323/orig 2025-09-07T07:46:33.8419257Z * [new branch] gh/ydwu4/324/base -> origin/gh/ydwu4/324/base 2025-09-07T07:46:33.8419456Z * [new branch] gh/ydwu4/324/head -> origin/gh/ydwu4/324/head 2025-09-07T07:46:33.8419660Z * [new branch] gh/ydwu4/324/orig -> origin/gh/ydwu4/324/orig 2025-09-07T07:46:33.8419855Z * [new branch] gh/yf225/133/base -> origin/gh/yf225/133/base 2025-09-07T07:46:33.8420049Z * [new branch] gh/yf225/133/head -> origin/gh/yf225/133/head 2025-09-07T07:46:33.8420256Z * [new branch] gh/yf225/171/base -> origin/gh/yf225/171/base 2025-09-07T07:46:33.8420454Z * [new branch] gh/yf225/171/head -> origin/gh/yf225/171/head 2025-09-07T07:46:33.8420655Z * [new branch] gh/yf225/171/orig -> origin/gh/yf225/171/orig 2025-09-07T07:46:33.8420934Z * [new branch] gh/yf225/172/base -> origin/gh/yf225/172/base 
2025-09-07T07:46:33.8421141Z * [new branch] gh/yf225/172/head -> origin/gh/yf225/172/head 2025-09-07T07:46:33.8421337Z * [new branch] gh/yf225/172/orig -> origin/gh/yf225/172/orig 2025-09-07T07:46:33.8421536Z * [new branch] gh/yf225/93/base -> origin/gh/yf225/93/base 2025-09-07T07:46:33.8421748Z * [new branch] gh/yf225/93/head -> origin/gh/yf225/93/head 2025-09-07T07:46:33.8421972Z * [new branch] gh/yifuwang/152/base -> origin/gh/yifuwang/152/base 2025-09-07T07:46:33.8422207Z * [new branch] gh/yifuwang/152/head -> origin/gh/yifuwang/152/head 2025-09-07T07:46:33.8422424Z * [new branch] gh/yifuwang/152/orig -> origin/gh/yifuwang/152/orig 2025-09-07T07:46:33.8422644Z * [new branch] gh/yifuwang/195/base -> origin/gh/yifuwang/195/base 2025-09-07T07:46:33.8422878Z * [new branch] gh/yifuwang/195/head -> origin/gh/yifuwang/195/head 2025-09-07T07:46:33.8423098Z * [new branch] gh/yifuwang/195/orig -> origin/gh/yifuwang/195/orig 2025-09-07T07:46:33.8423334Z * [new branch] gh/yiming0416/1/base -> origin/gh/yiming0416/1/base 2025-09-07T07:46:33.8423551Z * [new branch] gh/yiming0416/1/head -> origin/gh/yiming0416/1/head 2025-09-07T07:46:33.8423782Z * [new branch] gh/yiming0416/2/base -> origin/gh/yiming0416/2/base 2025-09-07T07:46:33.8423999Z * [new branch] gh/yiming0416/2/head -> origin/gh/yiming0416/2/head 2025-09-07T07:46:33.8424217Z * [new branch] gh/ysiraichi/79/base -> origin/gh/ysiraichi/79/base 2025-09-07T07:46:33.8424449Z * [new branch] gh/ysiraichi/79/head -> origin/gh/ysiraichi/79/head 2025-09-07T07:46:33.8424670Z * [new branch] gh/ysiraichi/79/orig -> origin/gh/ysiraichi/79/orig 2025-09-07T07:46:33.8424905Z * [new branch] gh/ysiraichi/88/base -> origin/gh/ysiraichi/88/base 2025-09-07T07:46:33.8425205Z * [new branch] gh/ysiraichi/88/head -> origin/gh/ysiraichi/88/head 2025-09-07T07:46:33.8425441Z * [new branch] gh/ysiraichi/88/orig -> origin/gh/ysiraichi/88/orig 2025-09-07T07:46:33.8425658Z * [new branch] gh/zhxchen17/25/base -> origin/gh/zhxchen17/25/base 
2025-09-07T07:46:33.8425874Z * [new branch] gh/zhxchen17/25/head -> origin/gh/zhxchen17/25/head 2025-09-07T07:46:33.8426101Z * [new branch] gh/zhxchen17/25/orig -> origin/gh/zhxchen17/25/orig 2025-09-07T07:46:33.8426317Z * [new branch] gh/zhxchen17/31/base -> origin/gh/zhxchen17/31/base 2025-09-07T07:46:33.8426544Z * [new branch] gh/zhxchen17/31/head -> origin/gh/zhxchen17/31/head 2025-09-07T07:46:33.8426765Z * [new branch] gh/zhxchen17/31/orig -> origin/gh/zhxchen17/31/orig 2025-09-07T07:46:33.8426979Z * [new branch] gh/zhxchen17/34/base -> origin/gh/zhxchen17/34/base 2025-09-07T07:46:33.8427213Z * [new branch] gh/zhxchen17/34/head -> origin/gh/zhxchen17/34/head 2025-09-07T07:46:33.8427429Z * [new branch] gh/zhxchen17/35/base -> origin/gh/zhxchen17/35/base 2025-09-07T07:46:33.8427659Z * [new branch] gh/zhxchen17/35/head -> origin/gh/zhxchen17/35/head 2025-09-07T07:46:33.8427874Z * [new branch] gh/zhxchen17/37/base -> origin/gh/zhxchen17/37/base 2025-09-07T07:46:33.8428102Z * [new branch] gh/zhxchen17/37/head -> origin/gh/zhxchen17/37/head 2025-09-07T07:46:33.8428320Z * [new branch] gh/zhxchen17/37/orig -> origin/gh/zhxchen17/37/orig 2025-09-07T07:46:33.8428538Z * [new branch] gh/zhxchen17/38/base -> origin/gh/zhxchen17/38/base 2025-09-07T07:46:33.8428855Z * [new branch] gh/zhxchen17/38/head -> origin/gh/zhxchen17/38/head 2025-09-07T07:46:33.8429072Z * [new branch] gh/zhxchen17/38/orig -> origin/gh/zhxchen17/38/orig 2025-09-07T07:46:33.8429311Z * [new branch] gh/zhxchen17/39/base -> origin/gh/zhxchen17/39/base 2025-09-07T07:46:33.8429528Z * [new branch] gh/zhxchen17/39/head -> origin/gh/zhxchen17/39/head 2025-09-07T07:46:33.8429743Z * [new branch] gh/zhxchen17/39/orig -> origin/gh/zhxchen17/39/orig 2025-09-07T07:46:33.8429970Z * [new branch] gh/zhxchen17/40/base -> origin/gh/zhxchen17/40/base 2025-09-07T07:46:33.8430186Z * [new branch] gh/zhxchen17/40/head -> origin/gh/zhxchen17/40/head 2025-09-07T07:46:33.8430414Z * [new branch] gh/zhxchen17/40/orig -> 
origin/gh/zhxchen17/40/orig 2025-09-07T07:46:33.8430631Z * [new branch] gh/zhxchen17/41/base -> origin/gh/zhxchen17/41/base 2025-09-07T07:46:33.8430860Z * [new branch] gh/zhxchen17/41/head -> origin/gh/zhxchen17/41/head 2025-09-07T07:46:33.8431079Z * [new branch] gh/zhxchen17/41/orig -> origin/gh/zhxchen17/41/orig 2025-09-07T07:46:33.8431298Z * [new branch] gh/zhxchen17/42/base -> origin/gh/zhxchen17/42/base 2025-09-07T07:46:33.8431527Z * [new branch] gh/zhxchen17/42/head -> origin/gh/zhxchen17/42/head 2025-09-07T07:46:33.8431744Z * [new branch] gh/zhxchen17/42/orig -> origin/gh/zhxchen17/42/orig 2025-09-07T07:46:33.8431975Z * [new branch] gh/zhxchen17/43/base -> origin/gh/zhxchen17/43/base 2025-09-07T07:46:33.8432191Z * [new branch] gh/zhxchen17/43/head -> origin/gh/zhxchen17/43/head 2025-09-07T07:46:33.8432420Z * [new branch] gh/zhxchen17/43/orig -> origin/gh/zhxchen17/43/orig 2025-09-07T07:46:33.8432636Z * [new branch] gh/zhxchen17/44/base -> origin/gh/zhxchen17/44/base 2025-09-07T07:46:33.8432856Z * [new branch] gh/zhxchen17/44/head -> origin/gh/zhxchen17/44/head 2025-09-07T07:46:33.8433185Z * [new branch] gh/zhxchen17/44/orig -> origin/gh/zhxchen17/44/orig 2025-09-07T07:46:33.8433407Z * [new branch] gh/zhxchen17/45/base -> origin/gh/zhxchen17/45/base 2025-09-07T07:46:33.8433637Z * [new branch] gh/zhxchen17/45/head -> origin/gh/zhxchen17/45/head 2025-09-07T07:46:33.8433856Z * [new branch] gh/zhxchen17/45/orig -> origin/gh/zhxchen17/45/orig 2025-09-07T07:46:33.8434065Z * [new branch] gh/zklaus/10/base -> origin/gh/zklaus/10/base 2025-09-07T07:46:33.8434284Z * [new branch] gh/zklaus/10/head -> origin/gh/zklaus/10/head 2025-09-07T07:46:33.8434490Z * [new branch] gh/zklaus/10/orig -> origin/gh/zklaus/10/orig 2025-09-07T07:46:33.8434711Z * [new branch] gh/zklaus/11/base -> origin/gh/zklaus/11/base 2025-09-07T07:46:33.8434915Z * [new branch] gh/zklaus/11/head -> origin/gh/zklaus/11/head 2025-09-07T07:46:33.8435136Z * [new branch] gh/zklaus/11/orig -> 
origin/gh/zklaus/11/orig 2025-09-07T07:46:33.8435342Z * [new branch] gh/zklaus/12/base -> origin/gh/zklaus/12/base 2025-09-07T07:46:33.8435547Z * [new branch] gh/zklaus/12/head -> origin/gh/zklaus/12/head 2025-09-07T07:46:33.8435764Z * [new branch] gh/zklaus/12/orig -> origin/gh/zklaus/12/orig 2025-09-07T07:46:33.8435969Z * [new branch] gh/zklaus/14/base -> origin/gh/zklaus/14/base 2025-09-07T07:46:33.8436187Z * [new branch] gh/zklaus/14/head -> origin/gh/zklaus/14/head 2025-09-07T07:46:33.8436391Z * [new branch] gh/zklaus/14/orig -> origin/gh/zklaus/14/orig 2025-09-07T07:46:33.8436686Z * [new branch] gh/zklaus/15/base -> origin/gh/zklaus/15/base 2025-09-07T07:46:33.8436903Z * [new branch] gh/zklaus/15/head -> origin/gh/zklaus/15/head 2025-09-07T07:46:33.8437110Z * [new branch] gh/zklaus/15/orig -> origin/gh/zklaus/15/orig 2025-09-07T07:46:33.8437328Z * [new branch] gh/zklaus/16/base -> origin/gh/zklaus/16/base 2025-09-07T07:46:33.8437532Z * [new branch] gh/zklaus/16/head -> origin/gh/zklaus/16/head 2025-09-07T07:46:33.8437749Z * [new branch] gh/zklaus/16/orig -> origin/gh/zklaus/16/orig 2025-09-07T07:46:33.8437953Z * [new branch] gh/zklaus/17/base -> origin/gh/zklaus/17/base 2025-09-07T07:46:33.8438159Z * [new branch] gh/zklaus/17/head -> origin/gh/zklaus/17/head 2025-09-07T07:46:33.8438378Z * [new branch] gh/zklaus/17/orig -> origin/gh/zklaus/17/orig 2025-09-07T07:46:33.8438585Z * [new branch] gh/zklaus/18/base -> origin/gh/zklaus/18/base 2025-09-07T07:46:33.8438805Z * [new branch] gh/zklaus/18/head -> origin/gh/zklaus/18/head 2025-09-07T07:46:33.8439014Z * [new branch] gh/zklaus/18/orig -> origin/gh/zklaus/18/orig 2025-09-07T07:46:33.8439218Z * [new branch] gh/zklaus/19/base -> origin/gh/zklaus/19/base 2025-09-07T07:46:33.8439436Z * [new branch] gh/zklaus/19/head -> origin/gh/zklaus/19/head 2025-09-07T07:46:33.8439640Z * [new branch] gh/zklaus/19/orig -> origin/gh/zklaus/19/orig 2025-09-07T07:46:33.8439855Z * [new branch] gh/zklaus/20/base -> 
origin/gh/zklaus/20/base 2025-09-07T07:46:33.8440060Z * [new branch] gh/zklaus/20/head -> origin/gh/zklaus/20/head 2025-09-07T07:46:33.8440276Z * [new branch] gh/zklaus/20/orig -> origin/gh/zklaus/20/orig 2025-09-07T07:46:33.8440484Z * [new branch] gh/zklaus/7/base -> origin/gh/zklaus/7/base 2025-09-07T07:46:33.8440688Z * [new branch] gh/zklaus/7/head -> origin/gh/zklaus/7/head 2025-09-07T07:46:33.8440991Z * [new branch] gh/zklaus/7/orig -> origin/gh/zklaus/7/orig 2025-09-07T07:46:33.8441194Z * [new branch] gh/zklaus/9/base -> origin/gh/zklaus/9/base 2025-09-07T07:46:33.8441411Z * [new branch] gh/zklaus/9/head -> origin/gh/zklaus/9/head 2025-09-07T07:46:33.8441613Z * [new branch] gh/zklaus/9/orig -> origin/gh/zklaus/9/orig 2025-09-07T07:46:33.8441830Z * [new branch] gh/zou3519/1175/base -> origin/gh/zou3519/1175/base 2025-09-07T07:46:33.8442056Z * [new branch] gh/zou3519/1175/head -> origin/gh/zou3519/1175/head 2025-09-07T07:46:33.8442269Z * [new branch] gh/zou3519/1175/orig -> origin/gh/zou3519/1175/orig 2025-09-07T07:46:33.8442497Z * [new branch] gh/zou3519/1177/base -> origin/gh/zou3519/1177/base 2025-09-07T07:46:33.8442708Z * [new branch] gh/zou3519/1177/head -> origin/gh/zou3519/1177/head 2025-09-07T07:46:33.8443066Z * [new branch] gh/zou3519/1177/orig -> origin/gh/zou3519/1177/orig 2025-09-07T07:46:33.8443276Z * [new branch] gh/zou3519/1191/base -> origin/gh/zou3519/1191/base 2025-09-07T07:46:33.8443487Z * [new branch] gh/zou3519/1191/head -> origin/gh/zou3519/1191/head 2025-09-07T07:46:33.8443714Z * [new branch] gh/zou3519/1191/orig -> origin/gh/zou3519/1191/orig 2025-09-07T07:46:33.8443923Z * [new branch] gh/zou3519/1192/base -> origin/gh/zou3519/1192/base 2025-09-07T07:46:33.8444149Z * [new branch] gh/zou3519/1192/head -> origin/gh/zou3519/1192/head 2025-09-07T07:46:33.8444358Z * [new branch] gh/zou3519/1192/orig -> origin/gh/zou3519/1192/orig 2025-09-07T07:46:33.8444697Z * [new branch] gh/zou3519/1193/base -> origin/gh/zou3519/1193/base 
2025-09-07T07:46:33.8444911Z * [new branch] gh/zou3519/1193/head -> origin/gh/zou3519/1193/head 2025-09-07T07:46:33.8445123Z * [new branch] gh/zou3519/1193/orig -> origin/gh/zou3519/1193/orig 2025-09-07T07:46:33.8445346Z * [new branch] gh/zou3519/1194/base -> origin/gh/zou3519/1194/base 2025-09-07T07:46:33.8445556Z * [new branch] gh/zou3519/1194/head -> origin/gh/zou3519/1194/head 2025-09-07T07:46:33.8445783Z * [new branch] gh/zou3519/1194/orig -> origin/gh/zou3519/1194/orig 2025-09-07T07:46:33.8445996Z * [new branch] gh/zou3519/1195/base -> origin/gh/zou3519/1195/base 2025-09-07T07:46:33.8446206Z * [new branch] gh/zou3519/1195/head -> origin/gh/zou3519/1195/head 2025-09-07T07:46:33.8446432Z * [new branch] gh/zou3519/1195/orig -> origin/gh/zou3519/1195/orig 2025-09-07T07:46:33.8446647Z * [new branch] gh/zou3519/1196/base -> origin/gh/zou3519/1196/base 2025-09-07T07:46:33.8446876Z * [new branch] gh/zou3519/1196/head -> origin/gh/zou3519/1196/head 2025-09-07T07:46:33.8447087Z * [new branch] gh/zou3519/1196/orig -> origin/gh/zou3519/1196/orig 2025-09-07T07:46:33.8447313Z * [new branch] gh/zou3519/1197/base -> origin/gh/zou3519/1197/base 2025-09-07T07:46:33.8447526Z * [new branch] gh/zou3519/1197/head -> origin/gh/zou3519/1197/head 2025-09-07T07:46:33.8447739Z * [new branch] gh/zou3519/1197/orig -> origin/gh/zou3519/1197/orig 2025-09-07T07:46:33.8447958Z * [new branch] gh/zpcore/1/base -> origin/gh/zpcore/1/base 2025-09-07T07:46:33.8448163Z * [new branch] gh/zpcore/1/head -> origin/gh/zpcore/1/head 2025-09-07T07:46:33.8448388Z * [new branch] gh/zpcore/10/base -> origin/gh/zpcore/10/base 2025-09-07T07:46:33.8448600Z * [new branch] gh/zpcore/10/head -> origin/gh/zpcore/10/head 2025-09-07T07:46:33.8448904Z * [new branch] gh/zpcore/10/orig -> origin/gh/zpcore/10/orig 2025-09-07T07:46:33.8449129Z * [new branch] gh/zpcore/11/base -> origin/gh/zpcore/11/base 2025-09-07T07:46:33.8449337Z * [new branch] gh/zpcore/11/head -> origin/gh/zpcore/11/head 2025-09-07T07:46:33.8449556Z 
* [new branch] gh/zpcore/11/orig -> origin/gh/zpcore/11/orig 2025-09-07T07:46:33.8449761Z * [new branch] gh/zpcore/12/base -> origin/gh/zpcore/12/base 2025-09-07T07:46:33.8449979Z * [new branch] gh/zpcore/12/head -> origin/gh/zpcore/12/head 2025-09-07T07:46:33.8450187Z * [new branch] gh/zpcore/12/orig -> origin/gh/zpcore/12/orig 2025-09-07T07:46:33.8450394Z * [new branch] gh/zpcore/13/base -> origin/gh/zpcore/13/base 2025-09-07T07:46:33.8450617Z * [new branch] gh/zpcore/13/head -> origin/gh/zpcore/13/head 2025-09-07T07:46:33.8450826Z * [new branch] gh/zpcore/13/orig -> origin/gh/zpcore/13/orig 2025-09-07T07:46:33.8451044Z * [new branch] gh/zpcore/14/base -> origin/gh/zpcore/14/base 2025-09-07T07:46:33.8451249Z * [new branch] gh/zpcore/14/head -> origin/gh/zpcore/14/head 2025-09-07T07:46:33.8451455Z * [new branch] gh/zpcore/2/base -> origin/gh/zpcore/2/base 2025-09-07T07:46:33.8451672Z * [new branch] gh/zpcore/2/head -> origin/gh/zpcore/2/head 2025-09-07T07:46:33.8451874Z * [new branch] gh/zpcore/3/base -> origin/gh/zpcore/3/base 2025-09-07T07:46:33.8452087Z * [new branch] gh/zpcore/3/head -> origin/gh/zpcore/3/head 2025-09-07T07:46:33.8452290Z * [new branch] gh/zpcore/4/base -> origin/gh/zpcore/4/base 2025-09-07T07:46:33.8452595Z * [new branch] gh/zpcore/4/head -> origin/gh/zpcore/4/head 2025-09-07T07:46:33.8452800Z * [new branch] gh/zpcore/5/base -> origin/gh/zpcore/5/base 2025-09-07T07:46:33.8453002Z * [new branch] gh/zpcore/5/head -> origin/gh/zpcore/5/head 2025-09-07T07:46:33.8453222Z * [new branch] gh/zpcore/6/base -> origin/gh/zpcore/6/base 2025-09-07T07:46:33.8453425Z * [new branch] gh/zpcore/6/head -> origin/gh/zpcore/6/head 2025-09-07T07:46:33.8453639Z * [new branch] gh/zpcore/7/base -> origin/gh/zpcore/7/base 2025-09-07T07:46:33.8453841Z * [new branch] gh/zpcore/7/head -> origin/gh/zpcore/7/head 2025-09-07T07:46:33.8454042Z * [new branch] gh/zpcore/8/base -> origin/gh/zpcore/8/base 2025-09-07T07:46:33.8454255Z * [new branch] gh/zpcore/8/head -> 
origin/gh/zpcore/8/head 2025-09-07T07:46:33.8454452Z * [new branch] google-main -> origin/google-main 2025-09-07T07:46:33.8454718Z * [new branch] guangyey/external_stream -> origin/guangyey/external_stream 2025-09-07T07:46:33.8454936Z * [new branch] guangyey/host_alloc -> origin/guangyey/host_alloc 2025-09-07T07:46:33.8455156Z * [new branch] guangyey/reimport -> origin/guangyey/reimport 2025-09-07T07:46:33.8455367Z * [new branch] guangyey/test_2025 -> origin/guangyey/test_2025 2025-09-07T07:46:33.8455754Z * [new branch] guilhermeleobas/cherry-pick-55d87d9dfd9 -> origin/guilhermeleobas/cherry-pick-55d87d9dfd9 2025-09-07T07:46:33.8456018Z * [new branch] haozhe/bf16-dynamic-shape -> origin/haozhe/bf16-dynamic-shape 2025-09-07T07:46:33.8456199Z * [new branch] hc_baseline -> origin/hc_baseline 2025-09-07T07:46:33.8456390Z * [new branch] hf_update -> origin/hf_update 2025-09-07T07:46:33.8456582Z * [new branch] hhh_decomp_mul -> origin/hhh_decomp_mul 2025-09-07T07:46:33.8456873Z * [new branch] hhh_rand -> origin/hhh_rand 2025-09-07T07:46:33.8457065Z * [new branch] hoy/mmsplitk -> origin/hoy/mmsplitk 2025-09-07T07:46:33.8457278Z * [new branch] hoy/triton-PR3973 -> origin/hoy/triton-PR3973 2025-09-07T07:46:33.8457680Z * [new branch] hoy/triton-coalescing-baseline -> origin/hoy/triton-coalescing-baseline 2025-09-07T07:46:33.8457939Z * [new branch] hoy/triton-coalescing-new -> origin/hoy/triton-coalescing-new 2025-09-07T07:46:33.8458201Z * [new branch] hoy/triton-coalescing-vec -> origin/hoy/triton-coalescing-vec 2025-09-07T07:46:33.8458422Z * [new branch] inductordecompfix -> origin/inductordecompfix 2025-09-07T07:46:33.8458595Z * [new branch] inline -> origin/inline 2025-09-07T07:46:33.8458783Z * [new branch] inlining -> origin/inlining 2025-09-07T07:46:33.8458994Z * [new branch] inlining-ezyang -> origin/inlining-ezyang 2025-09-07T07:46:33.8459247Z * [new branch] install-torchao-0.13.0 -> origin/install-torchao-0.13.0 2025-09-07T07:46:33.8459442Z * [new branch] int8_sdpa 
-> origin/int8_sdpa 2025-09-07T07:46:33.8459658Z * [new branch] invoke-subgraph -> origin/invoke-subgraph 2025-09-07T07:46:33.8459853Z * [new branch] issue#58739 -> origin/issue#58739 2025-09-07T07:46:33.8460199Z * [new branch] jcaip/test-cusparselt-version-0.6.2 -> origin/jcaip/test-cusparselt-version-0.6.2 2025-09-07T07:46:33.8460502Z * [new branch] jcaip/update-cusparselt-0.6.2 -> origin/jcaip/update-cusparselt-0.6.2 2025-09-07T07:46:33.8460937Z * [new branch] jeanschmidt/disable_rocm_build_tests -> origin/jeanschmidt/disable_rocm_build_tests 2025-09-07T07:46:33.8461202Z * [new branch] jithunnair-amd-patch-1 -> origin/jithunnair-amd-patch-1 2025-09-07T07:46:33.8461449Z * [new branch] jithunnair-amd-patch-2 -> origin/jithunnair-amd-patch-2 2025-09-07T07:46:33.8461718Z * [new branch] justinchu/attention-tests -> origin/justinchu/attention-tests 2025-09-07T07:46:33.8461943Z * [new branch] justinchu/native-qdq -> origin/justinchu/native-qdq 2025-09-07T07:46:33.8462152Z * [new branch] justinchu/ort-122 -> origin/justinchu/ort-122 2025-09-07T07:46:33.8462407Z * [new branch] justinchuby/dynamo-true -> origin/justinchuby/dynamo-true 2025-09-07T07:46:33.8462622Z * [new branch] kainan666/xlf_debug -> origin/kainan666/xlf_debug 2025-09-07T07:46:33.8462820Z * [new branch] kainan_test -> origin/kainan_test 2025-09-07T07:46:33.8463019Z * [new branch] learnablebias -> origin/learnablebias 2025-09-07T07:46:33.8463317Z * [new branch] leslie/test_group_gemm_epilogues -> origin/leslie/test_group_gemm_epilogues 2025-09-07T07:46:33.8463626Z * [new branch] lessw2020/fix_cutlass_cache_error -> origin/lessw2020/fix_cutlass_cache_error 2025-09-07T07:46:33.8463861Z * [new branch] liaoxuan/shm_all_reduce -> origin/liaoxuan/shm_all_reduce 2025-09-07T07:46:33.8464168Z * [new branch] liaoxuan/test_fa_disable_softmax -> origin/liaoxuan/test_fa_disable_softmax 2025-09-07T07:46:33.8464400Z * [new branch] liaoxuan/test_int8_sdpa -> origin/liaoxuan/test_int8_sdpa 2025-09-07T07:46:33.8464618Z * 
[new branch] lintbuilddocker -> origin/lintbuilddocker 2025-09-07T07:46:33.8464808Z * [new branch] llama4-stable -> origin/llama4-stable 2025-09-07T07:46:33.8464990Z * [new branch] logdetfix -> origin/logdetfix 2025-09-07T07:46:33.8465202Z * [new branch] lts/release/1.8 -> origin/lts/release/1.8 2025-09-07T07:46:33.8465524Z * [new branch] lucaskabela/#94773 -> origin/lucaskabela/#94773 2025-09-07T07:46:33.8465785Z * [new branch] lucaskabela/flop_counter -> origin/lucaskabela/flop_counter 2025-09-07T07:46:33.8466060Z * [new branch] lucaskabela/func_under_decomp -> origin/lucaskabela/func_under_decomp 2025-09-07T07:46:33.8466381Z * [new branch] lucaskabela/functional_in_dynamo -> origin/lucaskabela/functional_in_dynamo 2025-09-07T07:46:33.8466749Z * [new branch] lucaskabela/install_params_as_graph_attr -> origin/lucaskabela/install_params_as_graph_attr 2025-09-07T07:46:33.8466990Z * [new branch] lucaskabela/issue_120648 -> origin/lucaskabela/issue_120648 2025-09-07T07:46:33.8467289Z * [new branch] lucaskabela/misc_typing_dynamo -> origin/lucaskabela/misc_typing_dynamo 2025-09-07T07:46:33.8467623Z * [new branch] lucaskabela/parameters_as_graph_attr -> origin/lucaskabela/parameters_as_graph_attr 2025-09-07T07:46:33.8468023Z * [new branch] lucaskabela/remove_aot_dispatcher_metadata -> origin/lucaskabela/remove_aot_dispatcher_metadata 2025-09-07T07:46:33.8468261Z * [new branch] lucaskabela/rnn_decomp -> origin/lucaskabela/rnn_decomp 2025-09-07T07:46:33.8468522Z * [new branch] lucaskabela/typing_backends -> origin/lucaskabela/typing_backends 2025-09-07T07:46:33.8468862Z * [new branch] lucaskabela/typing_symbolic_convert -> origin/lucaskabela/typing_symbolic_convert 2025-09-07T07:46:33.8469209Z * [new branch] lucaskabela/typing_utils_improvements -> origin/lucaskabela/typing_utils_improvements 2025-09-07T07:46:33.8469382Z * [new branch] main -> origin/main 2025-09-07T07:46:33.8469895Z * [new branch] main-enable-b200-distributed-tests -> 
origin/main-enable-b200-distributed-tests 2025-09-07T07:46:33.8470109Z * [new branch] malfet-patch-1 -> origin/malfet-patch-1 2025-09-07T07:46:33.8470321Z * [new branch] malfet-patch-12 -> origin/malfet-patch-12 2025-09-07T07:46:33.8470527Z * [new branch] malfet-patch-14 -> origin/malfet-patch-14 2025-09-07T07:46:33.8470739Z * [new branch] malfet-patch-6 -> origin/malfet-patch-6 2025-09-07T07:46:33.8470935Z * [new branch] malfet-patch-8 -> origin/malfet-patch-8 2025-09-07T07:46:33.8471425Z * [new branch] malfet/be-move-more-settings-to-checkout-pytorch -> origin/malfet/be-move-more-settings-to-checkout-pytorch 2025-09-07T07:46:33.8471692Z * [new branch] malfet/delete-upsteam-cuda -> origin/malfet/delete-upsteam-cuda 2025-09-07T07:46:33.8471973Z * [new branch] malfet/mps-implement-col2im -> origin/malfet/mps-implement-col2im 2025-09-07T07:46:33.8472292Z * [new branch] manuel/test-ops-common-allow-mps -> origin/manuel/test-ops-common-allow-mps 2025-09-07T07:46:33.8472515Z * [new branch] metascroy-patch-1 -> origin/metascroy-patch-1 2025-09-07T07:46:33.8472742Z * [new branch] mlazos/S429861-debug -> origin/mlazos/S429861-debug 2025-09-07T07:46:33.8472920Z * [new branch] mlazos/aa -> origin/mlazos/aa 2025-09-07T07:46:33.8473144Z * [new branch] mlazos/arg-renames -> origin/mlazos/arg-renames 2025-09-07T07:46:33.8473396Z * [new branch] mlazos/backup-test-branch -> origin/mlazos/backup-test-branch 2025-09-07T07:46:33.8473641Z * [new branch] mlazos/bad-cudagraphs -> origin/mlazos/bad-cudagraphs 2025-09-07T07:46:33.8473843Z * [new branch] mlazos/baseline -> origin/mlazos/baseline 2025-09-07T07:46:33.8474128Z * [new branch] mlazos/baseline-graph-breaks -> origin/mlazos/baseline-graph-breaks 2025-09-07T07:46:33.8474352Z * [new branch] mlazos/beta-tensor -> origin/mlazos/beta-tensor 2025-09-07T07:46:33.8474645Z * [new branch] mlazos/better-msg -> origin/mlazos/better-msg 2025-09-07T07:46:33.8474850Z * [new branch] mlazos/buffers -> origin/mlazos/buffers 
2025-09-07T07:46:33.8475047Z * [new branch] mlazos/buffers2 -> origin/mlazos/buffers2 2025-09-07T07:46:33.8475246Z * [new branch] mlazos/buffers3 -> origin/mlazos/buffers3 2025-09-07T07:46:33.8475434Z * [new branch] mlazos/ck2 -> origin/mlazos/ck2 2025-09-07T07:46:33.8475658Z * [new branch] mlazos/combokernels -> origin/mlazos/combokernels 2025-09-07T07:46:33.8475882Z * [new branch] mlazos/ctx-cleanup -> origin/mlazos/ctx-cleanup 2025-09-07T07:46:33.8476104Z * [new branch] mlazos/cuda-cmd-log -> origin/mlazos/cuda-cmd-log 2025-09-07T07:46:33.8476358Z * [new branch] mlazos/cudagraph-tests -> origin/mlazos/cudagraph-tests 2025-09-07T07:46:33.8476654Z * [new branch] mlazos/cudagraphs-measurement -> origin/mlazos/cudagraphs-measurement 2025-09-07T07:46:33.8476876Z * [new branch] mlazos/cutlass-test -> origin/mlazos/cutlass-test 2025-09-07T07:46:33.8477131Z * [new branch] mlazos/cutlass-topo-bug -> origin/mlazos/cutlass-topo-bug 2025-09-07T07:46:33.8477343Z * [new branch] mlazos/data-gather -> origin/mlazos/data-gather 2025-09-07T07:46:33.8477569Z * [new branch] mlazos/data-ptrs2 -> origin/mlazos/data-ptrs2 2025-09-07T07:46:33.8477776Z * [new branch] mlazos/data-ptrs3 -> origin/mlazos/data-ptrs3 2025-09-07T07:46:33.8478018Z * [new branch] mlazos/dataclass-proxy -> origin/mlazos/dataclass-proxy 2025-09-07T07:46:33.8478341Z * [new branch] mlazos/dc-attrs -> origin/mlazos/dc-attrs 2025-09-07T07:46:33.8478551Z * [new branch] mlazos/dc-helion -> origin/mlazos/dc-helion 2025-09-07T07:46:33.8478764Z * [new branch] mlazos/dict-fix -> origin/mlazos/dict-fix 2025-09-07T07:46:33.8479006Z * [new branch] mlazos/disable-closures -> origin/mlazos/disable-closures 2025-09-07T07:46:33.8479233Z * [new branch] mlazos/disable-tf -> origin/mlazos/disable-tf 2025-09-07T07:46:33.8479433Z * [new branch] mlazos/dupe-fix -> origin/mlazos/dupe-fix 2025-09-07T07:46:33.8479636Z * [new branch] mlazos/dyn-batch -> origin/mlazos/dyn-batch 2025-09-07T07:46:33.8479830Z * [new branch] mlazos/evt -> 
origin/mlazos/evt 2025-09-07T07:46:33.8480041Z * [new branch] mlazos/exp_disable -> origin/mlazos/exp_disable 2025-09-07T07:46:33.8480303Z * [new branch] mlazos/extract-examples -> origin/mlazos/extract-examples 2025-09-07T07:46:33.8480519Z * [new branch] mlazos/foreach-op -> origin/mlazos/foreach-op 2025-09-07T07:46:33.8480709Z * [new branch] mlazos/fp8 -> origin/mlazos/fp8 2025-09-07T07:46:33.8480907Z * [new branch] mlazos/fp8-bias -> origin/mlazos/fp8-bias 2025-09-07T07:46:33.8481138Z * [new branch] mlazos/fp8-bias-fusion -> origin/mlazos/fp8-bias-fusion 2025-09-07T07:46:33.8481351Z * [new branch] mlazos/fp8-fixes -> origin/mlazos/fp8-fixes 2025-09-07T07:46:33.8481552Z * [new branch] mlazos/freezing -> origin/mlazos/freezing 2025-09-07T07:46:33.8481756Z * [new branch] mlazos/h-comp -> origin/mlazos/h-comp 2025-09-07T07:46:33.8481947Z * [new branch] mlazos/h-comp2 -> origin/mlazos/h-comp2 2025-09-07T07:46:33.8482154Z * [new branch] mlazos/hash-hop -> origin/mlazos/hash-hop 2025-09-07T07:46:33.8482343Z * [new branch] mlazos/hc -> origin/mlazos/hc 2025-09-07T07:46:33.8482639Z * [new branch] mlazos/hc-cycles -> origin/mlazos/hc-cycles 2025-09-07T07:46:33.8482994Z * [new branch] mlazos/hc-fixes -> origin/mlazos/hc-fixes 2025-09-07T07:46:33.8483200Z * [new branch] mlazos/hc-fixes3 -> origin/mlazos/hc-fixes3 2025-09-07T07:46:33.8483418Z * [new branch] mlazos/hc-fixes4 -> origin/mlazos/hc-fixes4 2025-09-07T07:46:33.8483606Z * [new branch] mlazos/hc-hf -> origin/mlazos/hc-hf 2025-09-07T07:46:33.8483793Z * [new branch] mlazos/hc-mut -> origin/mlazos/hc-mut 2025-09-07T07:46:33.8483986Z * [new branch] mlazos/hc10 -> origin/mlazos/hc10 2025-09-07T07:46:33.8484170Z * [new branch] mlazos/hc11 -> origin/mlazos/hc11 2025-09-07T07:46:33.8484361Z * [new branch] mlazos/hc12 -> origin/mlazos/hc12 2025-09-07T07:46:33.8484542Z * [new branch] mlazos/hc13 -> origin/mlazos/hc13 2025-09-07T07:46:33.8484720Z * [new branch] mlazos/hc14 -> origin/mlazos/hc14 2025-09-07T07:46:33.8484910Z * 
[new branch] mlazos/hc15 -> origin/mlazos/hc15 2025-09-07T07:46:33.8485085Z * [new branch] mlazos/hc2 -> origin/mlazos/hc2 2025-09-07T07:46:33.8485269Z * [new branch] mlazos/hc4 -> origin/mlazos/hc4 2025-09-07T07:46:33.8485449Z * [new branch] mlazos/hc5 -> origin/mlazos/hc5 2025-09-07T07:46:33.8485633Z * [new branch] mlazos/hc6 -> origin/mlazos/hc6 2025-09-07T07:46:33.8485808Z * [new branch] mlazos/hc7 -> origin/mlazos/hc7 2025-09-07T07:46:33.8486093Z * [new branch] mlazos/hc8 -> origin/mlazos/hc8 2025-09-07T07:46:33.8486283Z * [new branch] mlazos/hc9 -> origin/mlazos/hc9 2025-09-07T07:46:33.8486497Z * [new branch] mlazos/hc_baseline2 -> origin/mlazos/hc_baseline2 2025-09-07T07:46:33.8486743Z * [new branch] mlazos/init-per-param -> origin/mlazos/init-per-param 2025-09-07T07:46:33.8486968Z * [new branch] mlazos/init_per_param -> origin/mlazos/init_per_param 2025-09-07T07:46:33.8487181Z * [new branch] mlazos/less-guards -> origin/mlazos/less-guards 2025-09-07T07:46:33.8487429Z * [new branch] mlazos/lr-composibility -> origin/mlazos/lr-composibility 2025-09-07T07:46:33.8487608Z * [new branch] mlazos/main -> origin/mlazos/main 2025-09-07T07:46:33.8487889Z * [new branch] mlazos/main-test-enablement -> origin/mlazos/main-test-enablement 2025-09-07T07:46:33.8488076Z * [new branch] mlazos/main2 -> origin/mlazos/main2 2025-09-07T07:46:33.8488345Z * [new branch] mlazos/mark-static-update -> origin/mlazos/mark-static-update 2025-09-07T07:46:33.8488521Z * [new branch] mlazos/mcg -> origin/mlazos/mcg 2025-09-07T07:46:33.8488698Z * [new branch] mlazos/mcg2 -> origin/mlazos/mcg2 2025-09-07T07:46:33.8488921Z * [new branch] mlazos/meta-guards -> origin/mlazos/meta-guards 2025-09-07T07:46:33.8489129Z * [new branch] mlazos/mlazos/ck2 -> origin/mlazos/mlazos/ck2 2025-09-07T07:46:33.8489434Z * [new branch] mlazos/mlazos/foreach-map-adam -> origin/mlazos/mlazos/foreach-map-adam 2025-09-07T07:46:33.8489702Z * [new branch] mlazos/mlazos/tf-mode-backup -> 
origin/mlazos/mlazos/tf-mode-backup 2025-09-07T07:46:33.8489902Z * [new branch] mlazos/mod-fix -> origin/mlazos/mod-fix 2025-09-07T07:46:33.8490120Z * [new branch] mlazos/mode-fix -> origin/mlazos/mode-fix 2025-09-07T07:46:33.8490427Z * [new branch] mlazos/more-tests -> origin/mlazos/more-tests 2025-09-07T07:46:33.8490631Z * [new branch] mlazos/no-cpp -> origin/mlazos/no-cpp 2025-09-07T07:46:33.8490913Z * [new branch] mlazos/no-init-group-handling -> origin/mlazos/no-init-group-handling 2025-09-07T07:46:33.8491120Z * [new branch] mlazos/offsets -> origin/mlazos/offsets 2025-09-07T07:46:33.8491351Z * [new branch] mlazos/opt-bench-exp2 -> origin/mlazos/opt-bench-exp2 2025-09-07T07:46:33.8491551Z * [new branch] mlazos/opt-incr -> origin/mlazos/opt-incr 2025-09-07T07:46:33.8491781Z * [new branch] mlazos/proxy-ctors -> origin/mlazos/proxy-ctors 2025-09-07T07:46:33.8491983Z * [new branch] mlazos/quant-fix -> origin/mlazos/quant-fix 2025-09-07T07:46:33.8492211Z * [new branch] mlazos/resnet-fix -> origin/mlazos/resnet-fix 2025-09-07T07:46:33.8492438Z * [new branch] mlazos/revert-inline -> origin/mlazos/revert-inline 2025-09-07T07:46:33.8492653Z * [new branch] mlazos/rm-buf-names -> origin/mlazos/rm-buf-names 2025-09-07T07:46:33.8492861Z * [new branch] mlazos/rm-code -> origin/mlazos/rm-code 2025-09-07T07:46:33.8493052Z * [new branch] mlazos/rm-spam -> origin/mlazos/rm-spam 2025-09-07T07:46:33.8493242Z * [new branch] mlazos/rtp -> origin/mlazos/rtp 2025-09-07T07:46:33.8493468Z * [new branch] mlazos/static-idx-dbg -> origin/mlazos/static-idx-dbg 2025-09-07T07:46:33.8493726Z * [new branch] mlazos/static-inputs-log -> origin/mlazos/static-inputs-log 2025-09-07T07:46:33.8494062Z * [new branch] mlazos/sub-param-fix -> origin/mlazos/sub-param-fix 2025-09-07T07:46:33.8494254Z * [new branch] mlazos/td-fix2 -> origin/mlazos/td-fix2 2025-09-07T07:46:33.8494506Z * [new branch] mlazos/tensor-hasattr2 -> origin/mlazos/tensor-hasattr2 2025-09-07T07:46:33.8494684Z * [new branch] 
mlazos/test -> origin/mlazos/test 2025-09-07T07:46:33.8494891Z * [new branch] mlazos/tf-mode -> origin/mlazos/tf-mode 2025-09-07T07:46:33.8495121Z * [new branch] mlazos/tf-mode-backup2 -> origin/mlazos/tf-mode-backup2 2025-09-07T07:46:33.8495359Z * [new branch] mlazos/tf-mode-reland -> origin/mlazos/tf-mode-reland 2025-09-07T07:46:33.8495590Z * [new branch] mlazos/tf-mode-reland2 -> origin/mlazos/tf-mode-reland2 2025-09-07T07:46:33.8495817Z * [new branch] mlazos/tf-mode-reland3 -> origin/mlazos/tf-mode-reland3 2025-09-07T07:46:33.8496038Z * [new branch] mlazos/topo-fix -> origin/mlazos/topo-fix 2025-09-07T07:46:33.8496269Z * [new branch] mlazos/triton-no-epi -> origin/mlazos/triton-no-epi 2025-09-07T07:46:33.8496494Z * [new branch] mlazos/tune-proto -> origin/mlazos/tune-proto 2025-09-07T07:46:33.8496712Z * [new branch] mlazos/tuple-fixes -> origin/mlazos/tuple-fixes 2025-09-07T07:46:33.8496933Z * [new branch] mlazos/tuple-fixes2 -> origin/mlazos/tuple-fixes2 2025-09-07T07:46:33.8497176Z * [new branch] mlazos/tuple-handling -> origin/mlazos/tuple-handling 2025-09-07T07:46:33.8497469Z * [new branch] mlazos/user-streams -> origin/mlazos/user-streams 2025-09-07T07:46:33.8497688Z * [new branch] mlazos/vary-beta -> origin/mlazos/vary-beta 2025-09-07T07:46:33.8497897Z * [new branch] mlazos/vary-beta2 -> origin/mlazos/vary-beta2 2025-09-07T07:46:33.8498127Z * [new branch] mlazos/weird-perf1 -> origin/mlazos/weird-perf1 2025-09-07T07:46:33.8498343Z * [new branch] mm_out_dtype_compile -> origin/mm_out_dtype_compile 2025-09-07T07:46:33.8498646Z * [new branch] modify-setupvllm -> origin/modify-setupvllm 2025-09-07T07:46:33.8498848Z * [new branch] module-shim -> origin/module-shim 2025-09-07T07:46:33.8499078Z * [new branch] move-theme-out-docker -> origin/move-theme-out-docker 2025-09-07T07:46:33.8499278Z * [new branch] msaroufim/be1 -> origin/msaroufim/be1 2025-09-07T07:46:33.8499485Z * [new branch] msaroufim/cn_path -> origin/msaroufim/cn_path 2025-09-07T07:46:33.8499751Z * 
[new branch] msaroufim/dtensorfusedadam -> origin/msaroufim/dtensorfusedadam 2025-09-07T07:46:33.8499967Z * [new branch] msaroufim/reduce -> origin/msaroufim/reduce 2025-09-07T07:46:33.8500177Z * [new branch] mtia/basic-cmake -> origin/mtia/basic-cmake 2025-09-07T07:46:33.8500358Z * [new branch] muon_dev -> origin/muon_dev 2025-09-07T07:46:33.8500539Z * [new branch] muon_dev_1 -> origin/muon_dev_1 2025-09-07T07:46:33.8500775Z * [new branch] nativert_num_outputs -> origin/nativert_num_outputs 2025-09-07T07:46:33.8500993Z * [new branch] nativert_numoutputs -> origin/nativert_numoutputs 2025-09-07T07:46:33.8501229Z * [new branch] new-modifiy-setupvllm -> origin/new-modifiy-setupvllm 2025-09-07T07:46:33.8501434Z * [new branch] new-setupvllm -> origin/new-setupvllm 2025-09-07T07:46:33.8501630Z * [new branch] new_zeros_dtype -> origin/new_zeros_dtype 2025-09-07T07:46:33.8501827Z * [new branch] newtest-base -> origin/newtest-base 2025-09-07T07:46:33.8502131Z * [new branch] ngimel/cat_perf1 -> origin/ngimel/cat_perf1 2025-09-07T07:46:33.8502338Z * [new branch] ngimel/einsum_fix -> origin/ngimel/einsum_fix 2025-09-07T07:46:33.8502582Z * [new branch] ngimel/error_index_list -> origin/ngimel/error_index_list 2025-09-07T07:46:33.8502799Z * [new branch] ngimel/fabric_check -> origin/ngimel/fabric_check 2025-09-07T07:46:33.8503014Z * [new branch] ngimel/fabric_fix -> origin/ngimel/fabric_fix 2025-09-07T07:46:33.8503276Z * [new branch] ngimel/fix_driver_init_error -> origin/ngimel/fix_driver_init_error 2025-09-07T07:46:33.8503540Z * [new branch] ngimel/fix_nccl_segment_seg -> origin/ngimel/fix_nccl_segment_seg 2025-09-07T07:46:33.8503722Z * [new branch] ngimel/gg_new -> origin/ngimel/gg_new 2025-09-07T07:46:33.8503928Z * [new branch] ngimel/modeguard -> origin/ngimel/modeguard 2025-09-07T07:46:33.8504165Z * [new branch] ngimel/multicast_fix -> origin/ngimel/multicast_fix 2025-09-07T07:46:33.8504403Z * [new branch] ngimel/rocm_handle_type -> origin/ngimel/rocm_handle_type 
2025-09-07T07:46:33.8504657Z * [new branch] ngimel/symm_handle_fabric -> origin/ngimel/symm_handle_fabric 2025-09-07T07:46:33.8504889Z * [new branch] ngimel/unbind_multimem -> origin/ngimel/unbind_multimem 2025-09-07T07:46:33.8505076Z * [new branch] nightly -> origin/nightly 2025-09-07T07:46:33.8505304Z * [new branch] nmacchioni-patch-10 -> origin/nmacchioni-patch-10 2025-09-07T07:46:33.8505526Z * [new branch] nmacchioni-patch-7 -> origin/nmacchioni-patch-7 2025-09-07T07:46:33.8505761Z * [new branch] nmacchioni-patch-8 -> origin/nmacchioni-patch-8 2025-09-07T07:46:33.8505978Z * [new branch] nmacchioni-patch-9 -> origin/nmacchioni-patch-9 2025-09-07T07:46:33.8506217Z * [new branch] nullplay/fuse_matmul -> origin/nullplay/fuse_matmul 2025-09-07T07:46:33.8506513Z * [new branch] nullplay_fuse_matmul -> origin/nullplay_fuse_matmul 2025-09-07T07:46:33.8506686Z * [new branch] one-off -> origin/one-off 2025-09-07T07:46:33.8506903Z * [new branch] orig/release/1.10 -> origin/orig/release/1.10 2025-09-07T07:46:33.8507104Z * [new branch] orig/release/1.11 -> origin/orig/release/1.11 2025-09-07T07:46:33.8507315Z * [new branch] orig/release/1.12 -> origin/orig/release/1.12 2025-09-07T07:46:33.8507518Z * [new branch] orig/release/1.13 -> origin/orig/release/1.13 2025-09-07T07:46:33.8507729Z * [new branch] orig/release/1.6 -> origin/orig/release/1.6 2025-09-07T07:46:33.8507929Z * [new branch] orig/release/1.7 -> origin/orig/release/1.7 2025-09-07T07:46:33.8508132Z * [new branch] orig/release/1.8 -> origin/orig/release/1.8 2025-09-07T07:46:33.8508346Z * [new branch] orig/release/1.9 -> origin/orig/release/1.9 2025-09-07T07:46:33.8508541Z * [new branch] orig/release/2.0 -> origin/orig/release/2.0 2025-09-07T07:46:33.8508750Z * [new branch] orig/release/2.1 -> origin/orig/release/2.1 2025-09-07T07:46:33.8508945Z * [new branch] orig/release/2.2 -> origin/orig/release/2.2 2025-09-07T07:46:33.8509142Z * [new branch] orig/release/2.3 -> origin/orig/release/2.3 
2025-09-07T07:46:33.8509350Z * [new branch] orig/release/2.4 -> origin/orig/release/2.4 2025-09-07T07:46:33.8509544Z * [new branch] orig/release/2.5 -> origin/orig/release/2.5 2025-09-07T07:46:33.8509754Z * [new branch] orig/release/2.6 -> origin/orig/release/2.6 2025-09-07T07:46:33.8510046Z * [new branch] orig/release/2.7 -> origin/orig/release/2.7 2025-09-07T07:46:33.8510258Z * [new branch] orig/release/2.8 -> origin/orig/release/2.8 2025-09-07T07:46:33.8510457Z * [new branch] oulgen/fx_graph -> origin/oulgen/fx_graph 2025-09-07T07:46:33.8510650Z * [new branch] padded-tensor -> origin/padded-tensor 2025-09-07T07:46:33.8510825Z * [new branch] pca2 -> origin/pca2 2025-09-07T07:46:33.8511036Z * [new branch] pianpwk-patch-1 -> origin/pianpwk-patch-1 2025-09-07T07:46:33.8511377Z * [new branch] pianpwk/backed_size_oblivious_export -> origin/pianpwk/backed_size_oblivious_export 2025-09-07T07:46:33.8511655Z * [new branch] pianpwk/invalidate_fake_memo -> origin/pianpwk/invalidate_fake_memo 2025-09-07T07:46:33.8511880Z * [new branch] pianpwk/max_1_strides -> origin/pianpwk/max_1_strides 2025-09-07T07:46:33.8512130Z * [new branch] pianpwk/maybe_guard_rel -> origin/pianpwk/maybe_guard_rel 2025-09-07T07:46:33.8512361Z * [new branch] pianpwk/nonzero_memo -> origin/pianpwk/nonzero_memo 2025-09-07T07:46:33.8512711Z * [new branch] pianpwk/oblivious_reshape_view_better -> origin/pianpwk/oblivious_reshape_view_better 2025-09-07T07:46:33.8513010Z * [new branch] pianpwk/oblivious_slice_forward -> origin/pianpwk/oblivious_slice_forward 2025-09-07T07:46:33.8513260Z * [new branch] pianpwk/oblivious_where -> origin/pianpwk/oblivious_where 2025-09-07T07:46:33.8513508Z * [new branch] pianpwk/param_static_pgo -> origin/pianpwk/param_static_pgo 2025-09-07T07:46:33.8513750Z * [new branch] pianpwk/pre_forward_hook -> origin/pianpwk/pre_forward_hook 2025-09-07T07:46:33.8514045Z * [new branch] pianpwk/remove_guard_fail_break -> origin/pianpwk/remove_guard_fail_break 2025-09-07T07:46:33.8514304Z 
* [new branch] pianpwk/slice_fresh_symbols -> origin/pianpwk/slice_fresh_symbols 2025-09-07T07:46:33.8514636Z * [new branch] pianpwk/sym_tokens_draft -> origin/pianpwk/sym_tokens_draft 2025-09-07T07:46:33.8514974Z * [new branch] pianpwk/test_pointwise_guard_or_false -> origin/pianpwk/test_pointwise_guard_or_false 2025-09-07T07:46:33.8515244Z * [new branch] pianpwk/test_slice_fake_impl -> origin/pianpwk/test_slice_fake_impl 2025-09-07T07:46:33.8515527Z * [new branch] pianpwk/totally_draft_sym_wrap -> origin/pianpwk/totally_draft_sym_wrap 2025-09-07T07:46:33.8515816Z * [new branch] pianpwk/unbacked_channels_last -> origin/pianpwk/unbacked_channels_last 2025-09-07T07:46:33.8516095Z * [new branch] pianpwk/unbacked_safe_conv1d -> origin/pianpwk/unbacked_safe_conv1d 2025-09-07T07:46:33.8516354Z * [new branch] pianpwk/unbacked_sdpa_flash -> origin/pianpwk/unbacked_sdpa_flash 2025-09-07T07:46:33.8516631Z * [new branch] pianpwk/unbacked_should_swap -> origin/pianpwk/unbacked_should_swap 2025-09-07T07:46:33.8516917Z * [new branch] pianpwk/unbacked_should_swap_2 -> origin/pianpwk/unbacked_should_swap_2 2025-09-07T07:46:33.8517218Z * [new branch] pianpwk/unbacked_slice_binding -> origin/pianpwk/unbacked_slice_binding 2025-09-07T07:46:33.8517499Z * [new branch] pianpwk/unbacked_slice_forward -> origin/pianpwk/unbacked_slice_forward 2025-09-07T07:46:33.8517725Z * [new branch] pianpwk/user_symints -> origin/pianpwk/user_symints 2025-09-07T07:46:33.8517966Z * [new branch] pianpwk/wan21_reshape -> origin/pianpwk/wan21_reshape 2025-09-07T07:46:33.8518230Z * [new branch] pianpwk/whitelist_optimizer -> origin/pianpwk/whitelist_optimizer 2025-09-07T07:46:33.8518432Z * [new branch] pin-torchao -> origin/pin-torchao 2025-09-07T07:46:33.8518765Z * [new branch] piz/fall_back_missing_0716 -> origin/piz/fall_back_missing_0716 2025-09-07T07:46:33.8519000Z * [new branch] piz/improve_scatter_0808 -> origin/piz/improve_scatter_0808 2025-09-07T07:46:33.8519214Z * [new branch] pool-separate -> 
origin/pool-separate 2025-09-07T07:46:33.8519387Z * [new branch] pr-156087 -> origin/pr-156087 2025-09-07T07:46:33.8519571Z * [new branch] pr/131860 -> origin/pr/131860 2025-09-07T07:46:33.8519771Z * [new branch] predispatch_to -> origin/predispatch_to 2025-09-07T07:46:33.8519974Z * [new branch] pt-opt-cuda3 -> origin/pt-opt-cuda3 2025-09-07T07:46:33.8520162Z * [new branch] pyobjectslot -> origin/pyobjectslot 2025-09-07T07:46:33.8520400Z * [new branch] python_compiled_autograd -> origin/python_compiled_autograd 2025-09-07T07:46:33.8520657Z * [new branch] qchip/export-D54134695 -> origin/qchip/export-D54134695 2025-09-07T07:46:33.8520837Z * [new branch] quint-bits -> origin/quint-bits 2025-09-07T07:46:33.8521037Z * [new branch] release/1.10 -> origin/release/1.10 2025-09-07T07:46:33.8521218Z * [new branch] release/1.11 -> origin/release/1.11 2025-09-07T07:46:33.8521399Z * [new branch] release/1.12 -> origin/release/1.12 2025-09-07T07:46:33.8521592Z * [new branch] release/1.13 -> origin/release/1.13 2025-09-07T07:46:33.8521771Z * [new branch] release/1.4 -> origin/release/1.4 2025-09-07T07:46:33.8521966Z * [new branch] release/1.4.1 -> origin/release/1.4.1 2025-09-07T07:46:33.8522144Z * [new branch] release/1.5 -> origin/release/1.5 2025-09-07T07:46:33.8522332Z * [new branch] release/1.6 -> origin/release/1.6 2025-09-07T07:46:33.8522517Z * [new branch] release/1.7 -> origin/release/1.7 2025-09-07T07:46:33.8522782Z * [new branch] release/1.8 -> origin/release/1.8 2025-09-07T07:46:33.8523144Z * [new branch] release/1.9 -> origin/release/1.9 2025-09-07T07:46:33.8523327Z * [new branch] release/2.0 -> origin/release/2.0 2025-09-07T07:46:33.8523518Z * [new branch] release/2.1 -> origin/release/2.1 2025-09-07T07:46:33.8523697Z * [new branch] release/2.2 -> origin/release/2.2 2025-09-07T07:46:33.8523873Z * [new branch] release/2.3 -> origin/release/2.3 2025-09-07T07:46:33.8524064Z * [new branch] release/2.4 -> origin/release/2.4 2025-09-07T07:46:33.8524241Z * [new branch] 
release/2.5 -> origin/release/2.5 2025-09-07T07:46:33.8524436Z * [new branch] release/2.6 -> origin/release/2.6 2025-09-07T07:46:33.8524615Z * [new branch] release/2.7 -> origin/release/2.7 2025-09-07T07:46:33.8524804Z * [new branch] release/2.8 -> origin/release/2.8 2025-09-07T07:46:33.8524993Z * [new branch] release_notes -> origin/release_notes 2025-09-07T07:46:33.8525244Z * [new branch] remove-actionable-label -> origin/remove-actionable-label 2025-09-07T07:46:33.8525429Z * [new branch] remove-ao -> origin/remove-ao 2025-09-07T07:46:33.8525694Z * [new branch] removedeprecatedvllmtest -> origin/removedeprecatedvllmtest 2025-09-07T07:46:33.8526048Z * [new branch] replace-pytorch-labs-20250812-195836 -> origin/replace-pytorch-labs-20250812-195836 2025-09-07T07:46:33.8526397Z * [new branch] replace-pytorch-labs-20250812-200248 -> origin/replace-pytorch-labs-20250812-200248 2025-09-07T07:46:33.8526870Z * [new branch] replace-pytorch-labs-20250812-200324 -> origin/replace-pytorch-labs-20250812-200324 2025-09-07T07:46:33.8527218Z * [new branch] replace-pytorch-labs-20250812-204020 -> origin/replace-pytorch-labs-20250812-204020 2025-09-07T07:46:33.8527558Z * [new branch] replace-pytorch-labs-20250812-204125 -> origin/replace-pytorch-labs-20250812-204125 2025-09-07T07:46:33.8527913Z * [new branch] replace-pytorch-labs-20250812-205624 -> origin/replace-pytorch-labs-20250812-205624 2025-09-07T07:46:33.8528310Z * [new branch] revert-131069-gh/krzysztofjordan/1/head -> origin/revert-131069-gh/krzysztofjordan/1/head 2025-09-07T07:46:33.8528782Z * [new branch] revert-131469-gh/andrewor14/51/head -> origin/revert-131469-gh/andrewor14/51/head 2025-09-07T07:46:33.8529145Z * [new branch] revert-156870-gh/skarjala/3/head -> origin/revert-156870-gh/skarjala/3/head 2025-09-07T07:46:33.8529656Z * [new branch] revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ -> origin/revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ 2025-09-07T07:46:33.8529864Z * [new branch] rocm-monitoring 
-> origin/rocm-monitoring 2025-09-07T07:46:33.8530075Z * [new branch] ruisi/relax_memory -> origin/ruisi/relax_memory 2025-09-07T07:46:33.8530387Z * [new branch] run-torchbench-smoke-test-h100 -> origin/run-torchbench-smoke-test-h100 2025-09-07T07:46:33.8530803Z * [new branch] ryanguo99/cleanup-dynamo-expected-failures -> origin/ryanguo99/cleanup-dynamo-expected-failures 2025-09-07T07:46:33.8531071Z * [new branch] ryanguo99/fix-closure-var -> origin/ryanguo99/fix-closure-var 2025-09-07T07:46:33.8531299Z * [new branch] rzou/faketensor_bench -> origin/rzou/faketensor_bench 2025-09-07T07:46:33.8531488Z * [new branch] rzou/njt -> origin/rzou/njt 2025-09-07T07:46:33.8531665Z * [new branch] rzou/pca -> origin/rzou/pca 2025-09-07T07:46:33.8531857Z * [new branch] rzou/realprop -> origin/rzou/realprop 2025-09-07T07:46:33.8532205Z * [new branch] rzou/setup_context -> origin/rzou/setup_context 2025-09-07T07:46:33.8532576Z * [new branch] sanchitintel/refactor_aten_int8_woq_gemm -> origin/sanchitintel/refactor_aten_int8_woq_gemm 2025-09-07T07:46:33.8533086Z * [new branch] sanchitintel/weird_thing_with_test_cpu_select_algorithm -> origin/sanchitintel/weird_thing_with_test_cpu_select_algorithm 2025-09-07T07:46:33.8533345Z * [new branch] sapling-pr-archive-SS-JIA -> origin/sapling-pr-archive-SS-JIA 2025-09-07T07:46:33.8533507Z * [new branch] save -> origin/save 2025-09-07T07:46:33.8533702Z * [new branch] sdym/2.5.1 -> origin/sdym/2.5.1 2025-09-07T07:46:33.8533931Z * [new branch] seemethere-patch-1 -> origin/seemethere-patch-1 2025-09-07T07:46:33.8534122Z * [new branch] setupvllm -> origin/setupvllm 2025-09-07T07:46:33.8534329Z * [new branch] share_and_pin_fork -> origin/share_and_pin_fork 2025-09-07T07:46:33.8534567Z * [new branch] shengf/fx-xform-perf -> origin/shengf/fx-xform-perf 2025-09-07T07:46:33.8534793Z * [new branch] shikaili_fp8_allgather -> origin/shikaili_fp8_allgather 2025-09-07T07:46:33.8535015Z * [new branch] shoumikhin-patch-1 -> origin/shoumikhin-patch-1 
2025-09-07T07:46:33.8535255Z * [new branch] shoumikhin-patch-12 -> origin/shoumikhin-patch-12 2025-09-07T07:46:33.8535496Z * [new branch] simplify-fq-per-channel -> origin/simplify-fq-per-channel 2025-09-07T07:46:33.8535724Z * [new branch] solve-accuracy-fix -> origin/solve-accuracy-fix 2025-09-07T07:46:33.8536049Z * [new branch] soulitzer/stash-tls-ac -> origin/soulitzer/stash-tls-ac 2025-09-07T07:46:33.8536260Z * [new branch] sqzhang/flight4 -> origin/sqzhang/flight4 2025-09-07T07:46:33.8536494Z * [new branch] sqzhang/flight4plus -> origin/sqzhang/flight4plus 2025-09-07T07:46:33.8536747Z * [new branch] sraikund/record_funct_test -> origin/sraikund/record_funct_test 2025-09-07T07:46:33.8536958Z * [new branch] sraikund16/test -> origin/sraikund16/test 2025-09-07T07:46:33.8537219Z * [new branch] stablize-compilation-time -> origin/stablize-compilation-time 2025-09-07T07:46:33.8537542Z * [new branch] standalone-templates -> origin/standalone-templates 2025-09-07T07:46:33.8537802Z * [new branch] standalone_package_weights -> origin/standalone_package_weights 2025-09-07T07:46:33.8538014Z * [new branch] starterTaskUpdate -> origin/starterTaskUpdate 2025-09-07T07:46:33.8538220Z * [new branch] subgraph_fuse -> origin/subgraph_fuse 2025-09-07T07:46:33.8538475Z * [new branch] support-uv-in-collect_env -> origin/support-uv-in-collect_env 2025-09-07T07:46:33.8538657Z * [new branch] sve-poc -> origin/sve-poc 2025-09-07T07:46:33.8538865Z * [new branch] svekars-patch-1 -> origin/svekars-patch-1 2025-09-07T07:46:33.8539062Z * [new branch] switch-bn -> origin/switch-bn 2025-09-07T07:46:33.8539310Z * [new branch] sympy-bottleneck-repro -> origin/sympy-bottleneck-repro 2025-09-07T07:46:33.8539553Z * [new branch] tenpercent/ck_rocm_ci_v3 -> origin/tenpercent/ck_rocm_ci_v3 2025-09-07T07:46:33.8539802Z * [new branch] tensordict_integration -> origin/tensordict_integration 2025-09-07T07:46:33.8539977Z * [new branch] test-7054 -> origin/test-7054 2025-09-07T07:46:33.8540234Z * [new 
branch] test-move-conda-builds -> origin/test-move-conda-builds 2025-09-07T07:46:33.8540627Z * [new branch] test-myst-markdown-docstring -> origin/test-myst-markdown-docstring 2025-09-07T07:46:33.8540809Z * [new branch] test-old -> origin/test-old 2025-09-07T07:46:33.8541119Z * [new branch] test-vec-migration-internally -> origin/test-vec-migration-internally 2025-09-07T07:46:33.8541306Z * [new branch] test/bmm_heur -> origin/test/bmm_heur 2025-09-07T07:46:33.8541513Z * [new branch] test/inductor -> origin/test/inductor 2025-09-07T07:46:33.8541770Z * [new branch] tianren/flex_paged_attn_fix -> origin/tianren/flex_paged_attn_fix 2025-09-07T07:46:33.8542076Z * [new branch] tianren/flex_paged_attn_fix_temp -> origin/tianren/flex_paged_attn_fix_temp 2025-09-07T07:46:33.8542266Z * [new branch] tianren/test -> origin/tianren/test 2025-09-07T07:46:33.8542496Z * [new branch] tidy_performance_cyy -> origin/tidy_performance_cyy 2025-09-07T07:46:33.8542710Z * [new branch] torchtitan_ep -> origin/torchtitan_ep 2025-09-07T07:46:33.8542952Z * [new branch] trace_fsdp_torchtune_lora -> origin/trace_fsdp_torchtune_lora 2025-09-07T07:46:33.8543204Z * [new branch] traceable_fsdp_unit_tests -> origin/traceable_fsdp_unit_tests 2025-09-07T07:46:33.8543407Z * [new branch] tree_loop_vec_base -> origin/tree_loop_vec_base 2025-09-07T07:46:33.8543596Z * [new branch] tree_vec_base -> origin/tree_vec_base 2025-09-07T07:46:33.8543791Z * [new branch] triton-update -> origin/triton-update 2025-09-07T07:46:33.8543977Z * [new branch] triton_kernel -> origin/triton_kernel 2025-09-07T07:46:33.8544201Z * [new branch] triton_kernel_perf -> origin/triton_kernel_perf 2025-09-07T07:46:33.8544473Z * [new branch] tt_pkg_1908 -> origin/tt_pkg_1908 2025-09-07T07:46:33.8544784Z * [new branch] tweak-transformer-dependabot -> origin/tweak-transformer-dependabot 2025-09-07T07:46:33.8544957Z * [new branch] type_dec -> origin/type_dec 2025-09-07T07:46:33.8545223Z * [new branch] udate-sphinx-dependancies -> 
origin/udate-sphinx-dependancies 2025-09-07T07:46:33.8545632Z * [new branch] update-audio-commit-hash/16818882925-1712-1 -> origin/update-audio-commit-hash/16818882925-1712-1 2025-09-07T07:46:33.8546025Z * [new branch] update-audio-commit-hash/16895560422-1720-1 -> origin/update-audio-commit-hash/16895560422-1720-1 2025-09-07T07:46:33.8546427Z * [new branch] update-audio-commit-hash/16924174496-1738-1 -> origin/update-audio-commit-hash/16924174496-1738-1 2025-09-07T07:46:33.8546819Z * [new branch] update-audio-commit-hash/17002010821-1749-1 -> origin/update-audio-commit-hash/17002010821-1749-1 2025-09-07T07:46:33.8547219Z * [new branch] update-audio-commit-hash/17056004427-1766-1 -> origin/update-audio-commit-hash/17056004427-1766-1 2025-09-07T07:46:33.8547605Z * [new branch] update-audio-commit-hash/17085054029-1767-1 -> origin/update-audio-commit-hash/17085054029-1767-1 2025-09-07T07:46:33.8547991Z * [new branch] update-audio-commit-hash/17142507405-1771-1 -> origin/update-audio-commit-hash/17142507405-1771-1 2025-09-07T07:46:33.8548395Z * [new branch] update-audio-commit-hash/17168762740-1773-1 -> origin/update-audio-commit-hash/17168762740-1773-1 2025-09-07T07:46:33.8548781Z * [new branch] update-audio-commit-hash/17311174639-1780-1 -> origin/update-audio-commit-hash/17311174639-1780-1 2025-09-07T07:46:33.8549177Z * [new branch] update-audio-commit-hash/17336898740-1781-1 -> origin/update-audio-commit-hash/17336898740-1781-1 2025-09-07T07:46:33.8549566Z * [new branch] update-audio-commit-hash/17389727684-1786-1 -> origin/update-audio-commit-hash/17389727684-1786-1 2025-09-07T07:46:33.8789678Z * [new branch] update-audio-commit-hash/17449538142-1790-1 -> origin/update-audio-commit-hash/17449538142-1790-1 2025-09-07T07:46:33.8790113Z * [new branch] update-audio-commit-hash/17507351808-1794-1 -> origin/update-audio-commit-hash/17507351808-1794-1 2025-09-07T07:46:33.8790400Z * [new branch] update-dynamic-shapes-doc -> origin/update-dynamic-shapes-doc 
2025-09-07T07:46:33.8790850Z * [new branch] update-executorch-commit-hash/15694981040-1626-1 -> origin/update-executorch-commit-hash/15694981040-1626-1 2025-09-07T07:46:33.8791250Z * [new branch] update-triton-commit-hash/13663274526-1487-2 -> origin/update-triton-commit-hash/13663274526-1487-2 2025-09-07T07:46:33.8791658Z * [new branch] update-vision-commit-hash/15336342773-1607-1 -> origin/update-vision-commit-hash/15336342773-1607-1 2025-09-07T07:46:33.8792043Z * [new branch] update-vllm-commit-hash/16737365217-1704-1 -> origin/update-vllm-commit-hash/16737365217-1704-1 2025-09-07T07:46:33.8792434Z * [new branch] update-vllm-commit-hash/16843157111-1713-1 -> origin/update-vllm-commit-hash/16843157111-1713-1 2025-09-07T07:46:33.8792811Z * [new branch] update-vllm-commit-hash/16855312394-1714-1 -> origin/update-vllm-commit-hash/16855312394-1714-1 2025-09-07T07:46:33.8793191Z * [new branch] update-vllm-commit-hash/16924174496-1738-1 -> origin/update-vllm-commit-hash/16924174496-1738-1 2025-09-07T07:46:33.8793565Z * [new branch] update-vllm-commit-hash/16952608705-1745-1 -> origin/update-vllm-commit-hash/16952608705-1745-1 2025-09-07T07:46:33.8793936Z * [new branch] update-vllm-commit-hash/16979836546-1748-1 -> origin/update-vllm-commit-hash/16979836546-1748-1 2025-09-07T07:46:33.8794431Z * [new branch] update-vllm-commit-hash/17014576881-1756-1 -> origin/update-vllm-commit-hash/17014576881-1756-1 2025-09-07T07:46:33.8794809Z * [new branch] update-vllm-commit-hash/17027830869-1761-1 -> origin/update-vllm-commit-hash/17027830869-1761-1 2025-09-07T07:46:33.8795197Z * [new branch] update-vllm-commit-hash/17056004427-1766-1 -> origin/update-vllm-commit-hash/17056004427-1766-1 2025-09-07T07:46:33.8795568Z * [new branch] update-vllm-commit-hash/17085054029-1767-1 -> origin/update-vllm-commit-hash/17085054029-1767-1 2025-09-07T07:46:33.8795955Z * [new branch] update-vllm-commit-hash/17113610216-1768-1 -> origin/update-vllm-commit-hash/17113610216-1768-1 
2025-09-07T07:46:33.8796329Z * [new branch] update-vllm-commit-hash/17142507405-1771-1 -> origin/update-vllm-commit-hash/17142507405-1771-1 2025-09-07T07:46:33.8796718Z * [new branch] update-vllm-commit-hash/17181878974-1774-1 -> origin/update-vllm-commit-hash/17181878974-1774-1 2025-09-07T07:46:33.8797098Z * [new branch] update-vllm-commit-hash/17311174639-1780-1 -> origin/update-vllm-commit-hash/17311174639-1780-1 2025-09-07T07:46:33.8797474Z * [new branch] update-vllm-commit-hash/17336898740-1781-1 -> origin/update-vllm-commit-hash/17336898740-1781-1 2025-09-07T07:46:33.8797856Z * [new branch] update-vllm-commit-hash/17364352302-1785-1 -> origin/update-vllm-commit-hash/17364352302-1785-1 2025-09-07T07:46:33.8798230Z * [new branch] update-vllm-commit-hash/17389727684-1786-1 -> origin/update-vllm-commit-hash/17389727684-1786-1 2025-09-07T07:46:33.8798613Z * [new branch] update-vllm-commit-hash/17449538142-1790-1 -> origin/update-vllm-commit-hash/17449538142-1790-1 2025-09-07T07:46:33.8798987Z * [new branch] update-vllm-commit-hash/17480069797-1791-1 -> origin/update-vllm-commit-hash/17480069797-1791-1 2025-09-07T07:46:33.8799370Z * [new branch] update-vllm-commit-hash/17507351808-1794-1 -> origin/update-vllm-commit-hash/17507351808-1794-1 2025-09-07T07:46:33.8799842Z * [new branch] update-xla-commit-hash/16873912760-198-1 -> origin/update-xla-commit-hash/16873912760-198-1 2025-09-07T07:46:33.8800207Z * [new branch] update-xla-commit-hash/17034266655-199-1 -> origin/update-xla-commit-hash/17034266655-199-1 2025-09-07T07:46:33.8800580Z * [new branch] update-xla-commit-hash/17202464405-200-1 -> origin/update-xla-commit-hash/17202464405-200-1 2025-09-07T07:46:33.8800948Z * [new branch] update_docs_torch_multinomial_issue#125388 -> origin/update_docs_torch_multinomial_issue#125388 2025-09-07T07:46:33.8801193Z * [new branch] update_executorch_pin -> origin/update_executorch_pin 2025-09-07T07:46:33.8801448Z * [new branch] update_slow_tests_1722488736 -> 
origin/update_slow_tests_1722488736 2025-09-07T07:46:33.8801708Z * [new branch] update_slow_tests_1722879173 -> origin/update_slow_tests_1722879173 2025-09-07T07:46:33.8801962Z * [new branch] update_slow_tests_1752478971 -> origin/update_slow_tests_1752478971 2025-09-07T07:46:33.8802215Z * [new branch] update_slow_tests_1755502951 -> origin/update_slow_tests_1755502951 2025-09-07T07:46:33.8802477Z * [new branch] update_slow_tests_1756107664 -> origin/update_slow_tests_1756107664 2025-09-07T07:46:33.8802729Z * [new branch] update_submodule_FBGEMM -> origin/update_submodule_FBGEMM 2025-09-07T07:46:33.8803100Z * [new branch] update_submodule_kineto -> origin/update_submodule_kineto 2025-09-07T07:46:33.8803365Z * [new branch] update_submodule_tensorpipe -> origin/update_submodule_tensorpipe 2025-09-07T07:46:33.8803549Z * [new branch] v0.1.2 -> origin/v0.1.2 2025-09-07T07:46:33.8803726Z * [new branch] v1.0.1 -> origin/v1.0.1 2025-09-07T07:46:33.8804002Z * [new branch] v1.0.3 -> origin/v1.0.3 2025-09-07T07:46:33.8804182Z * [new branch] v1.1.0 -> origin/v1.1.0 2025-09-07T07:46:33.8804352Z * [new branch] v1.2.0 -> origin/v1.2.0 2025-09-07T07:46:33.8804529Z * [new branch] v1.3.0 -> origin/v1.3.0 2025-09-07T07:46:33.8804693Z * [new branch] v1.3.1 -> origin/v1.3.1 2025-09-07T07:46:33.8804880Z * [new branch] validate_fn -> origin/validate_fn 2025-09-07T07:46:33.8805102Z * [new branch] validations_2.6 -> origin/validations_2.6 2025-09-07T07:46:33.8805304Z * [new branch] validations_2.8 -> origin/validations_2.8 2025-09-07T07:46:33.8805507Z * [new branch] viable/strict -> origin/viable/strict 2025-09-07T07:46:33.8805696Z * [new branch] vllmbuildci -> origin/vllmbuildci 2025-09-07T07:46:33.8805890Z * [new branch] vllmpin -> origin/vllmpin 2025-09-07T07:46:33.8806134Z * [new branch] wdvr/conda_devcontainer -> origin/wdvr/conda_devcontainer 2025-09-07T07:46:33.8806332Z * [new branch] wdvr/iss_145259 -> origin/wdvr/iss_145259 2025-09-07T07:46:33.8806561Z * [new branch] 
weight_sharing_cpp -> origin/weight_sharing_cpp 2025-09-07T07:46:33.8806744Z * [new branch] whc/flight4 -> origin/whc/flight4 2025-09-07T07:46:33.8806948Z * [new branch] whc/flight51 -> origin/whc/flight51 2025-09-07T07:46:33.8807138Z * [new branch] whc/flight53 -> origin/whc/flight53 2025-09-07T07:46:33.8807322Z * [new branch] whc/stage2 -> origin/whc/stage2 2025-09-07T07:46:33.8807516Z * [new branch] whc/uneven -> origin/whc/uneven 2025-09-07T07:46:33.8807729Z * [new branch] whc/uneven-merge -> origin/whc/uneven-merge 2025-09-07T07:46:33.8807927Z * [new branch] win_warnings -> origin/win_warnings 2025-09-07T07:46:33.8808249Z * [new branch] windows_libtorch_free -> origin/windows_libtorch_free 2025-09-07T07:46:33.8808475Z * [new branch] workonoldcommit -> origin/workonoldcommit 2025-09-07T07:46:33.8808906Z * [new branch] wychi-autotune-prune-configs-by-shared-mem -> origin/wychi-autotune-prune-configs-by-shared-mem 2025-09-07T07:46:33.8809090Z * [new branch] xmfan/ca_0516 -> origin/xmfan/ca_0516 2025-09-07T07:46:33.8809317Z * [new branch] xmfan/ca_1051b93192 -> origin/xmfan/ca_1051b93192 2025-09-07T07:46:33.8809752Z * [new branch] xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 -> origin/xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 2025-09-07T07:46:33.8809983Z * [new branch] xmfan/ca_5a2be192d1 -> origin/xmfan/ca_5a2be192d1 2025-09-07T07:46:33.8810195Z * [new branch] xmfan/ca_9d59b516e9 -> origin/xmfan/ca_9d59b516e9 2025-09-07T07:46:33.8810401Z * [new branch] xmfan/ca_api -> origin/xmfan/ca_api 2025-09-07T07:46:33.8810590Z * [new branch] xmfan/ca_apr8 -> origin/xmfan/ca_apr8 2025-09-07T07:46:33.8810776Z * [new branch] xmfan/ca_base -> origin/xmfan/ca_base 2025-09-07T07:46:33.8811006Z * [new branch] xmfan/ca_cudagraphs -> origin/xmfan/ca_cudagraphs 2025-09-07T07:46:33.8811209Z * [new branch] xmfan/ca_dynamic -> origin/xmfan/ca_dynamic 2025-09-07T07:46:33.8811419Z * [new branch] xmfan/ca_fix_dyn -> origin/xmfan/ca_fix_dyn 2025-09-07T07:46:33.8811636Z * [new 
branch] xmfan/ca_fix_lowering -> origin/xmfan/ca_fix_lowering 2025-09-07T07:46:33.8811858Z * [new branch] xmfan/ca_fix_polyfills -> origin/xmfan/ca_fix_polyfills 2025-09-07T07:46:33.8812140Z * [new branch] xmfan/ca_jan3 -> origin/xmfan/ca_jan3 2025-09-07T07:46:33.8812336Z * [new branch] xmfan/ca_jun18 -> origin/xmfan/ca_jun18 2025-09-07T07:46:33.8812538Z * [new branch] xmfan/ca_jun24 -> origin/xmfan/ca_jun24 2025-09-07T07:46:33.8812742Z * [new branch] xmfan/ca_mem_base -> origin/xmfan/ca_mem_base 2025-09-07T07:46:33.8812951Z * [new branch] xmfan/ca_mem_fix -> origin/xmfan/ca_mem_fix 2025-09-07T07:46:33.8813162Z * [new branch] xmfan/ca_memory_fix -> origin/xmfan/ca_memory_fix 2025-09-07T07:46:33.8813409Z * [new branch] xmfan/ca_memory_fix_rebased -> origin/xmfan/ca_memory_fix_rebased 2025-09-07T07:46:33.8813676Z * [new branch] xmfan/ca_memory_fix_rebased2 -> origin/xmfan/ca_memory_fix_rebased2 2025-09-07T07:46:33.8813896Z * [new branch] xmfan/ca_move_to_cuda -> origin/xmfan/ca_move_to_cuda 2025-09-07T07:46:33.8814105Z * [new branch] xmfan/ca_nested -> origin/xmfan/ca_nested 2025-09-07T07:46:33.8814313Z * [new branch] xmfan/ca_overhead -> origin/xmfan/ca_overhead 2025-09-07T07:46:33.8814571Z * [new branch] xmfan/ca_overhead_0eba7e5451 -> origin/xmfan/ca_overhead_0eba7e5451 2025-09-07T07:46:33.8814780Z * [new branch] xmfan/ca_scalar -> origin/xmfan/ca_scalar 2025-09-07T07:46:33.8815021Z * [new branch] xmfan/ca_subclass_mem_fix -> origin/xmfan/ca_subclass_mem_fix 2025-09-07T07:46:33.8815236Z * [new branch] xmfan/ca_warm_mem -> origin/xmfan/ca_warm_mem 2025-09-07T07:46:33.8815454Z * [new branch] xmfan/ca_warm_mem_base -> origin/xmfan/ca_warm_mem_base 2025-09-07T07:46:33.8815667Z * [new branch] xmfan/cacu_jun18 -> origin/xmfan/cacu_jun18 2025-09-07T07:46:33.8815871Z * [new branch] xmfan/cacu_jun19 -> origin/xmfan/cacu_jun19 2025-09-07T07:46:33.8816067Z * [new branch] xmfan/cacu_jun4 -> origin/xmfan/cacu_jun4 2025-09-07T07:46:33.8816364Z * [new branch] xmfan/cacu_may27 -> 
origin/xmfan/cacu_may27 2025-09-07T07:46:33.8816611Z * [new branch] xmfan/disable_duck_shape -> origin/xmfan/disable_duck_shape 2025-09-07T07:46:33.8816909Z * [new branch] xmfan/fca_cpp_node_passthrough -> origin/xmfan/fca_cpp_node_passthrough 2025-09-07T07:46:33.8817117Z * [new branch] xmfan/issue_123374 -> origin/xmfan/issue_123374 2025-09-07T07:46:33.8817680Z * [new branch] xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 2025-09-07T07:46:33.8818119Z * [new branch] xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 2025-09-07T07:46:33.8818344Z * [new branch] xmfan/segfault_test -> origin/xmfan/segfault_test 2025-09-07T07:46:33.8818570Z * [new branch] xmfan/single_step -> origin/xmfan/single_step 2025-09-07T07:46:33.8818760Z * [new branch] xmfan/sth_0829 -> origin/xmfan/sth_0829 2025-09-07T07:46:33.8818954Z * [new branch] xmfan/test -> origin/xmfan/test 2025-09-07T07:46:33.8819204Z * [new branch] yguo/debug-0226-constexpr -> origin/yguo/debug-0226-constexpr 2025-09-07T07:46:33.8819447Z * [new branch] yguo/new_latest_changes -> origin/yguo/new_latest_changes 2025-09-07T07:46:33.8819716Z * [new branch] yguo/patch_constexpr_changes -> origin/yguo/patch_constexpr_changes 2025-09-07T07:46:33.8819929Z * [new branch] yihan_quantization -> origin/yihan_quantization 2025-09-07T07:46:33.8820218Z * [new branch] yiming/add_jit_trace_benchmark -> origin/yiming/add_jit_trace_benchmark 2025-09-07T07:46:33.8820598Z * [new branch] yiming/add_nativert_benchmark -> origin/yiming/add_nativert_benchmark 2025-09-07T07:46:33.8820818Z * [new branch] yiming/bootcamp -> origin/yiming/bootcamp 2025-09-07T07:46:33.8821030Z * [new branch] zainr/canary-test -> origin/zainr/canary-test 2025-09-07T07:46:33.8821292Z * [new branch] zainr/cleanup-gh-runners -> origin/zainr/cleanup-gh-runners 2025-09-07T07:46:33.8821520Z * [new branch] zainr/git-push-v2 -> 
origin/zainr/git-push-v2 2025-09-07T07:46:33.8821763Z * [new branch] zainr/pull-migration-c -> origin/zainr/pull-migration-c 2025-09-07T07:46:33.8821954Z * [new branch] zainr/test -> origin/zainr/test 2025-09-07T07:46:33.8822136Z * [new branch] zainr/test2 -> origin/zainr/test2 2025-09-07T07:46:33.8822346Z * [new branch] zainr/unstable -> origin/zainr/unstable 2025-09-07T07:46:33.8822569Z * [new branch] zainr/unstable-xla -> origin/zainr/unstable-xla 2025-09-07T07:46:33.8822792Z * [new branch] zasdfgbnm-patch-3 -> origin/zasdfgbnm-patch-3 2025-09-07T07:46:33.8822968Z * [new branch] zb2p -> origin/zb2p 2025-09-07T07:46:33.8823197Z * [new branch] zero_grad_optimization -> origin/zero_grad_optimization 2025-09-07T07:46:33.8823458Z * [new branch] zeros-and-scatter-part2 -> origin/zeros-and-scatter-part2 2025-09-07T07:46:33.8823680Z * [new branch] zhxchen17/scratch/0 -> origin/zhxchen17/scratch/0 2025-09-07T07:46:33.8823908Z * [new branch] zhxhcen17/moodycamel -> origin/zhxhcen17/moodycamel 2025-09-07T07:46:33.8824103Z * [new branch] zxiiro/main -> origin/zxiiro/main 2025-09-07T07:46:33.8824590Z * [new tag] bc2caa7fdf006894eff7af936babde69ab5a40f8-huydhn-debug -> bc2caa7fdf006894eff7af936babde69ab5a40f8-huydhn-debug 2025-09-07T07:46:33.8824777Z * [new tag] ci/binaries/77164 -> ci/binaries/77164 2025-09-07T07:46:33.8825047Z * [new tag] ciflow/binaries/156049 -> ciflow/binaries/156049 2025-09-07T07:46:33.8825249Z * [new tag] ciflow/binaries/156712 -> ciflow/binaries/156712 2025-09-07T07:46:33.8825436Z * [new tag] ciflow/binaries/157432 -> ciflow/binaries/157432 2025-09-07T07:46:33.8825620Z * [new tag] ciflow/binaries/157685 -> ciflow/binaries/157685 2025-09-07T07:46:33.8825820Z * [new tag] ciflow/binaries/157689 -> ciflow/binaries/157689 2025-09-07T07:46:33.8826005Z * [new tag] ciflow/binaries/158104 -> ciflow/binaries/158104 2025-09-07T07:46:33.8826204Z * [new tag] ciflow/binaries/160229 -> ciflow/binaries/160229 2025-09-07T07:46:33.8826395Z * [new tag] 
ciflow/binaries/160720 -> ciflow/binaries/160720 2025-09-07T07:46:33.8826591Z * [new tag] ciflow/binaries/162080 -> ciflow/binaries/162080 2025-09-07T07:46:33.8826784Z * [new tag] ciflow/binaries/162329 -> ciflow/binaries/162329 2025-09-07T07:46:33.8827034Z * [new tag] ciflow/binaries_libtorch/156049 -> ciflow/binaries_libtorch/156049 2025-09-07T07:46:33.8827296Z * [new tag] ciflow/binaries_libtorch/156711 -> ciflow/binaries_libtorch/156711 2025-09-07T07:46:33.8827542Z * [new tag] ciflow/binaries_libtorch/157432 -> ciflow/binaries_libtorch/157432 2025-09-07T07:46:33.8827772Z * [new tag] ciflow/binaries_wheel/156049 -> ciflow/binaries_wheel/156049 2025-09-07T07:46:33.8827989Z * [new tag] ciflow/binaries_wheel/156711 -> ciflow/binaries_wheel/156711 2025-09-07T07:46:33.8828206Z * [new tag] ciflow/binaries_wheel/157432 -> ciflow/binaries_wheel/157432 2025-09-07T07:46:33.8828523Z * [new tag] ciflow/binaries_wheel/162136 -> ciflow/binaries_wheel/162136 2025-09-07T07:46:33.8828742Z * [new tag] ciflow/binaries_wheel/162252 -> ciflow/binaries_wheel/162252 2025-09-07T07:46:33.8828971Z * [new tag] ciflow/binaries_wheel/162325 -> ciflow/binaries_wheel/162325 2025-09-07T07:46:33.8829209Z * [new tag] ciflow/h100-distributed/156703 -> ciflow/h100-distributed/156703 2025-09-07T07:46:33.8829430Z * [new tag] ciflow/h100-symm-mem/157635 -> ciflow/h100-symm-mem/157635 2025-09-07T07:46:33.8829633Z * [new tag] ciflow/h100-symm-mem/161984 -> ciflow/h100-symm-mem/161984 2025-09-07T07:46:33.8829835Z * [new tag] ciflow/h100-symm-mem/162003 -> ciflow/h100-symm-mem/162003 2025-09-07T07:46:33.8830052Z * [new tag] ciflow/h100-symm-mem/162011 -> ciflow/h100-symm-mem/162011 2025-09-07T07:46:33.8830256Z * [new tag] ciflow/h100-symm-mem/162026 -> ciflow/h100-symm-mem/162026 2025-09-07T07:46:33.8830475Z * [new tag] ciflow/h100-symm-mem/162033 -> ciflow/h100-symm-mem/162033 2025-09-07T07:46:33.8830679Z * [new tag] ciflow/h100-symm-mem/162040 -> ciflow/h100-symm-mem/162040 2025-09-07T07:46:33.8830894Z 
* [new tag] ciflow/h100-symm-mem/162041 -> ciflow/h100-symm-mem/162041 2025-09-07T07:46:33.8831097Z * [new tag] ciflow/h100-symm-mem/162142 -> ciflow/h100-symm-mem/162142 2025-09-07T07:46:33.8831298Z * [new tag] ciflow/h100-symm-mem/162150 -> ciflow/h100-symm-mem/162150 2025-09-07T07:46:33.8831520Z * [new tag] ciflow/h100-symm-mem/162243 -> ciflow/h100-symm-mem/162243 2025-09-07T07:46:33.8831722Z * [new tag] ciflow/h100-symm-mem/162320 -> ciflow/h100-symm-mem/162320 2025-09-07T07:46:33.8831905Z * [new tag] ciflow/h100/159158 -> ciflow/h100/159158 2025-09-07T07:46:33.8832075Z * [new tag] ciflow/h100/160480 -> ciflow/h100/160480 2025-09-07T07:46:33.8832237Z * [new tag] ciflow/h100/161749 -> ciflow/h100/161749 2025-09-07T07:46:33.8832510Z * [new tag] ciflow/h100/162022 -> ciflow/h100/162022 2025-09-07T07:46:33.8832676Z * [new tag] ciflow/h100/162278 -> ciflow/h100/162278 2025-09-07T07:46:33.8833090Z * [new tag] ciflow/inductor-perf-test-nightly-rocm/156592 -> ciflow/inductor-perf-test-nightly-rocm/156592 2025-09-07T07:46:33.8833433Z * [new tag] ciflow/inductor-perf-test-nightly/156592 -> ciflow/inductor-perf-test-nightly/156592 2025-09-07T07:46:33.8833706Z * [new tag] ciflow/inductor-periodic/162063 -> ciflow/inductor-periodic/162063 2025-09-07T07:46:33.8833962Z * [new tag] ciflow/inductor-periodic/162227 -> ciflow/inductor-periodic/162227 2025-09-07T07:46:33.8834212Z * [new tag] ciflow/inductor-periodic/162323 -> ciflow/inductor-periodic/162323 2025-09-07T07:46:33.8834443Z * [new tag] ciflow/inductor-rocm/154170 -> ciflow/inductor-rocm/154170 2025-09-07T07:46:33.8834663Z * [new tag] ciflow/inductor-rocm/159146 -> ciflow/inductor-rocm/159146 2025-09-07T07:46:33.8834889Z * [new tag] ciflow/inductor-rocm/159158 -> ciflow/inductor-rocm/159158 2025-09-07T07:46:33.8835101Z * [new tag] ciflow/inductor-rocm/161715 -> ciflow/inductor-rocm/161715 2025-09-07T07:46:33.8835321Z * [new tag] ciflow/inductor-rocm/162053 -> ciflow/inductor-rocm/162053 2025-09-07T07:46:33.8835530Z * 
[new tag] ciflow/inductor-rocm/162056 -> ciflow/inductor-rocm/162056 2025-09-07T07:46:33.8835722Z * [new tag] ciflow/inductor/137400 -> ciflow/inductor/137400 2025-09-07T07:46:33.8835925Z * [new tag] ciflow/inductor/148180 -> ciflow/inductor/148180 2025-09-07T07:46:33.8836117Z * [new tag] ciflow/inductor/148328 -> ciflow/inductor/148328 2025-09-07T07:46:33.8836405Z * [new tag] ciflow/inductor/148484 -> ciflow/inductor/148484 2025-09-07T07:46:33.8836598Z * [new tag] ciflow/inductor/148492 -> ciflow/inductor/148492 2025-09-07T07:46:33.8836784Z * [new tag] ciflow/inductor/152624 -> ciflow/inductor/152624 2025-09-07T07:46:33.8844212Z * [new tag] ciflow/inductor/154694 -> ciflow/inductor/154694 2025-09-07T07:46:33.8844503Z * [new tag] ciflow/inductor/156049 -> ciflow/inductor/156049 2025-09-07T07:46:33.8844711Z * [new tag] ciflow/inductor/156592 -> ciflow/inductor/156592 2025-09-07T07:46:33.8844900Z * [new tag] ciflow/inductor/157635 -> ciflow/inductor/157635 2025-09-07T07:46:33.8845096Z * [new tag] ciflow/inductor/157685 -> ciflow/inductor/157685 2025-09-07T07:46:33.8845281Z * [new tag] ciflow/inductor/157686 -> ciflow/inductor/157686 2025-09-07T07:46:33.8845490Z * [new tag] ciflow/inductor/157689 -> ciflow/inductor/157689 2025-09-07T07:46:33.8845680Z * [new tag] ciflow/inductor/157699 -> ciflow/inductor/157699 2025-09-07T07:46:33.8845867Z * [new tag] ciflow/inductor/157743 -> ciflow/inductor/157743 2025-09-07T07:46:33.8846064Z * [new tag] ciflow/inductor/157994 -> ciflow/inductor/157994 2025-09-07T07:46:33.8846247Z * [new tag] ciflow/inductor/158091 -> ciflow/inductor/158091 2025-09-07T07:46:33.8846444Z * [new tag] ciflow/inductor/158104 -> ciflow/inductor/158104 2025-09-07T07:46:33.8846628Z * [new tag] ciflow/inductor/158404 -> ciflow/inductor/158404 2025-09-07T07:46:33.8846812Z * [new tag] ciflow/inductor/158647 -> ciflow/inductor/158647 2025-09-07T07:46:33.8847005Z * [new tag] ciflow/inductor/158932 -> ciflow/inductor/158932 2025-09-07T07:46:33.8847192Z * [new tag] 
ciflow/inductor/159146 -> ciflow/inductor/159146 2025-09-07T07:46:33.8847692Z * [new tag] ciflow/inductor/159158 -> ciflow/inductor/159158 2025-09-07T07:46:33.8847878Z * [new tag] ciflow/inductor/159274 -> ciflow/inductor/159274 2025-09-07T07:46:33.8848075Z * [new tag] ciflow/inductor/159664 -> ciflow/inductor/159664 2025-09-07T07:46:33.8848264Z * [new tag] ciflow/inductor/159778 -> ciflow/inductor/159778 2025-09-07T07:46:33.8848444Z * [new tag] ciflow/inductor/159835 -> ciflow/inductor/159835 2025-09-07T07:46:33.8848644Z * [new tag] ciflow/inductor/159944 -> ciflow/inductor/159944 2025-09-07T07:46:33.8848826Z * [new tag] ciflow/inductor/160161 -> ciflow/inductor/160161 2025-09-07T07:46:33.8849023Z * [new tag] ciflow/inductor/160174 -> ciflow/inductor/160174 2025-09-07T07:46:33.8849214Z * [new tag] ciflow/inductor/160323 -> ciflow/inductor/160323 2025-09-07T07:46:33.8849398Z * [new tag] ciflow/inductor/160324 -> ciflow/inductor/160324 2025-09-07T07:46:33.8849593Z * [new tag] ciflow/inductor/160325 -> ciflow/inductor/160325 2025-09-07T07:46:33.8849778Z * [new tag] ciflow/inductor/160326 -> ciflow/inductor/160326 2025-09-07T07:46:33.8849975Z * [new tag] ciflow/inductor/160327 -> ciflow/inductor/160327 2025-09-07T07:46:33.8850163Z * [new tag] ciflow/inductor/160328 -> ciflow/inductor/160328 2025-09-07T07:46:33.8850356Z * [new tag] ciflow/inductor/160329 -> ciflow/inductor/160329 2025-09-07T07:46:33.8850542Z * [new tag] ciflow/inductor/160480 -> ciflow/inductor/160480 2025-09-07T07:46:33.8850726Z * [new tag] ciflow/inductor/160532 -> ciflow/inductor/160532 2025-09-07T07:46:33.8851034Z * [new tag] ciflow/inductor/160539 -> ciflow/inductor/160539 2025-09-07T07:46:33.8851220Z * [new tag] ciflow/inductor/160580 -> ciflow/inductor/160580 2025-09-07T07:46:33.8851422Z * [new tag] ciflow/inductor/160685 -> ciflow/inductor/160685 2025-09-07T07:46:33.8851607Z * [new tag] ciflow/inductor/160686 -> ciflow/inductor/160686 2025-09-07T07:46:33.8851792Z * [new tag] 
ciflow/inductor/160687 -> ciflow/inductor/160687 2025-09-07T07:46:33.8851988Z * [new tag] ciflow/inductor/160688 -> ciflow/inductor/160688 2025-09-07T07:46:33.8852172Z * [new tag] ciflow/inductor/160690 -> ciflow/inductor/160690 2025-09-07T07:46:33.8852364Z * [new tag] ciflow/inductor/160706 -> ciflow/inductor/160706 2025-09-07T07:46:33.8852549Z * [new tag] ciflow/inductor/160729 -> ciflow/inductor/160729 2025-09-07T07:46:33.8852744Z * [new tag] ciflow/inductor/160798 -> ciflow/inductor/160798 2025-09-07T07:46:33.8852933Z * [new tag] ciflow/inductor/160836 -> ciflow/inductor/160836 2025-09-07T07:46:33.8853121Z * [new tag] ciflow/inductor/160843 -> ciflow/inductor/160843 2025-09-07T07:46:33.8853315Z * [new tag] ciflow/inductor/160869 -> ciflow/inductor/160869 2025-09-07T07:46:33.8853501Z * [new tag] ciflow/inductor/160920 -> ciflow/inductor/160920 2025-09-07T07:46:33.8853695Z * [new tag] ciflow/inductor/160928 -> ciflow/inductor/160928 2025-09-07T07:46:33.8853877Z * [new tag] ciflow/inductor/160943 -> ciflow/inductor/160943 2025-09-07T07:46:33.8854061Z * [new tag] ciflow/inductor/161092 -> ciflow/inductor/161092 2025-09-07T07:46:33.8854254Z * [new tag] ciflow/inductor/161093 -> ciflow/inductor/161093 2025-09-07T07:46:33.8854442Z * [new tag] ciflow/inductor/161109 -> ciflow/inductor/161109 2025-09-07T07:46:33.8854634Z * [new tag] ciflow/inductor/161118 -> ciflow/inductor/161118 2025-09-07T07:46:33.8854904Z * [new tag] ciflow/inductor/161178 -> ciflow/inductor/161178 2025-09-07T07:46:33.8855100Z * [new tag] ciflow/inductor/161246 -> ciflow/inductor/161246 2025-09-07T07:46:33.8855283Z * [new tag] ciflow/inductor/161349 -> ciflow/inductor/161349 2025-09-07T07:46:33.8855468Z * [new tag] ciflow/inductor/161350 -> ciflow/inductor/161350 2025-09-07T07:46:33.8855663Z * [new tag] ciflow/inductor/161351 -> ciflow/inductor/161351 2025-09-07T07:46:33.8855848Z * [new tag] ciflow/inductor/161397 -> ciflow/inductor/161397 2025-09-07T07:46:33.8856041Z * [new tag] 
ciflow/inductor/161404 -> ciflow/inductor/161404 2025-09-07T07:46:33.8856224Z * [new tag] ciflow/inductor/161405 -> ciflow/inductor/161405 2025-09-07T07:46:33.8856409Z * [new tag] ciflow/inductor/161406 -> ciflow/inductor/161406 2025-09-07T07:46:33.8856605Z * [new tag] ciflow/inductor/161410 -> ciflow/inductor/161410 2025-09-07T07:46:33.8856790Z * [new tag] ciflow/inductor/161414 -> ciflow/inductor/161414 2025-09-07T07:46:33.8856984Z * [new tag] ciflow/inductor/161442 -> ciflow/inductor/161442 2025-09-07T07:46:33.8857165Z * [new tag] ciflow/inductor/161458 -> ciflow/inductor/161458 2025-09-07T07:46:33.8857466Z * [new tag] ciflow/inductor/161468 -> ciflow/inductor/161468 2025-09-07T07:46:33.8857653Z * [new tag] ciflow/inductor/161469 -> ciflow/inductor/161469 2025-09-07T07:46:33.8857836Z * [new tag] ciflow/inductor/161485 -> ciflow/inductor/161485 2025-09-07T07:46:33.8858030Z * [new tag] ciflow/inductor/161499 -> ciflow/inductor/161499 2025-09-07T07:46:33.8858314Z * [new tag] ciflow/inductor/161534 -> ciflow/inductor/161534 2025-09-07T07:46:33.8858515Z * [new tag] ciflow/inductor/161595 -> ciflow/inductor/161595 2025-09-07T07:46:33.8858697Z * [new tag] ciflow/inductor/161596 -> ciflow/inductor/161596 2025-09-07T07:46:33.8858881Z * [new tag] ciflow/inductor/161630 -> ciflow/inductor/161630 2025-09-07T07:46:33.8859072Z * [new tag] ciflow/inductor/161667 -> ciflow/inductor/161667 2025-09-07T07:46:33.8859255Z * [new tag] ciflow/inductor/161670 -> ciflow/inductor/161670 2025-09-07T07:46:33.8859451Z * [new tag] ciflow/inductor/161673 -> ciflow/inductor/161673 2025-09-07T07:46:33.8859633Z * [new tag] ciflow/inductor/161674 -> ciflow/inductor/161674 2025-09-07T07:46:33.8859822Z * [new tag] ciflow/inductor/161675 -> ciflow/inductor/161675 2025-09-07T07:46:33.8860005Z * [new tag] ciflow/inductor/161693 -> ciflow/inductor/161693 2025-09-07T07:46:33.8860207Z * [new tag] ciflow/inductor/161695 -> ciflow/inductor/161695 2025-09-07T07:46:33.8860390Z * [new tag] 
ciflow/inductor/161715 -> ciflow/inductor/161715 2025-09-07T07:46:33.8860572Z * [new tag] ciflow/inductor/161730 -> ciflow/inductor/161730 2025-09-07T07:46:33.8860766Z * [new tag] ciflow/inductor/161732 -> ciflow/inductor/161732 2025-09-07T07:46:33.8860947Z * [new tag] ciflow/inductor/161744 -> ciflow/inductor/161744 2025-09-07T07:46:33.8861143Z * [new tag] ciflow/inductor/161746 -> ciflow/inductor/161746 2025-09-07T07:46:33.8861328Z * [new tag] ciflow/inductor/161747 -> ciflow/inductor/161747 2025-09-07T07:46:33.8861512Z * [new tag] ciflow/inductor/161819 -> ciflow/inductor/161819 2025-09-07T07:46:33.8861711Z * [new tag] ciflow/inductor/161821 -> ciflow/inductor/161821 2025-09-07T07:46:33.8861974Z * [new tag] ciflow/inductor/161828 -> ciflow/inductor/161828 2025-09-07T07:46:33.8862169Z * [new tag] ciflow/inductor/161879 -> ciflow/inductor/161879 2025-09-07T07:46:33.8862351Z * [new tag] ciflow/inductor/161880 -> ciflow/inductor/161880 2025-09-07T07:46:33.8862544Z * [new tag] ciflow/inductor/161881 -> ciflow/inductor/161881 2025-09-07T07:46:33.8862729Z * [new tag] ciflow/inductor/161907 -> ciflow/inductor/161907 2025-09-07T07:46:33.8862913Z * [new tag] ciflow/inductor/161914 -> ciflow/inductor/161914 2025-09-07T07:46:33.8863106Z * [new tag] ciflow/inductor/161924 -> ciflow/inductor/161924 2025-09-07T07:46:33.8863289Z * [new tag] ciflow/inductor/161936 -> ciflow/inductor/161936 2025-09-07T07:46:33.8863489Z * [new tag] ciflow/inductor/161938 -> ciflow/inductor/161938 2025-09-07T07:46:33.8863677Z * [new tag] ciflow/inductor/161939 -> ciflow/inductor/161939 2025-09-07T07:46:33.8863864Z * [new tag] ciflow/inductor/161940 -> ciflow/inductor/161940 2025-09-07T07:46:33.8864061Z * [new tag] ciflow/inductor/161955 -> ciflow/inductor/161955 2025-09-07T07:46:33.8864246Z * [new tag] ciflow/inductor/161957 -> ciflow/inductor/161957 2025-09-07T07:46:33.8864440Z * [new tag] ciflow/inductor/161975 -> ciflow/inductor/161975 2025-09-07T07:46:33.8864627Z * [new tag] 
ciflow/inductor/161977 -> ciflow/inductor/161977 2025-09-07T07:46:33.8864820Z * [new tag] ciflow/inductor/161978 -> ciflow/inductor/161978 2025-09-07T07:46:33.8865007Z * [new tag] ciflow/inductor/161979 -> ciflow/inductor/161979 2025-09-07T07:46:33.8865294Z * [new tag] ciflow/inductor/161980 -> ciflow/inductor/161980 2025-09-07T07:46:33.8865487Z * [new tag] ciflow/inductor/161988 -> ciflow/inductor/161988 2025-09-07T07:46:33.8865674Z * [new tag] ciflow/inductor/161994 -> ciflow/inductor/161994 2025-09-07T07:46:33.8865867Z * [new tag] ciflow/inductor/162013 -> ciflow/inductor/162013 2025-09-07T07:46:33.8866052Z * [new tag] ciflow/inductor/162014 -> ciflow/inductor/162014 2025-09-07T07:46:33.8866233Z * [new tag] ciflow/inductor/162017 -> ciflow/inductor/162017 2025-09-07T07:46:33.8866427Z * [new tag] ciflow/inductor/162021 -> ciflow/inductor/162021 2025-09-07T07:46:33.8866610Z * [new tag] ciflow/inductor/162023 -> ciflow/inductor/162023 2025-09-07T07:46:33.8866806Z * [new tag] ciflow/inductor/162027 -> ciflow/inductor/162027 2025-09-07T07:46:33.8866996Z * [new tag] ciflow/inductor/162029 -> ciflow/inductor/162029 2025-09-07T07:46:33.8867189Z * [new tag] ciflow/inductor/162030 -> ciflow/inductor/162030 2025-09-07T07:46:33.8867377Z * [new tag] ciflow/inductor/162031 -> ciflow/inductor/162031 2025-09-07T07:46:33.8867561Z * [new tag] ciflow/inductor/162033 -> ciflow/inductor/162033 2025-09-07T07:46:33.8867758Z * [new tag] ciflow/inductor/162052 -> ciflow/inductor/162052 2025-09-07T07:46:33.8867944Z * [new tag] ciflow/inductor/162053 -> ciflow/inductor/162053 2025-09-07T07:46:33.8868137Z * [new tag] ciflow/inductor/162056 -> ciflow/inductor/162056 2025-09-07T07:46:33.8868321Z * [new tag] ciflow/inductor/162063 -> ciflow/inductor/162063 2025-09-07T07:46:33.8868503Z * [new tag] ciflow/inductor/162066 -> ciflow/inductor/162066 2025-09-07T07:46:33.8868696Z * [new tag] ciflow/inductor/162068 -> ciflow/inductor/162068 2025-09-07T07:46:33.8868883Z * [new tag] 
ciflow/inductor/162081 -> ciflow/inductor/162081 2025-09-07T07:46:33.8869164Z * [new tag] ciflow/inductor/162088 -> ciflow/inductor/162088 2025-09-07T07:46:33.8869353Z * [new tag] ciflow/inductor/162089 -> ciflow/inductor/162089 2025-09-07T07:46:33.8869544Z * [new tag] ciflow/inductor/162094 -> ciflow/inductor/162094 2025-09-07T07:46:33.8869725Z * [new tag] ciflow/inductor/162098 -> ciflow/inductor/162098 2025-09-07T07:46:33.8869907Z * [new tag] ciflow/inductor/162101 -> ciflow/inductor/162101 2025-09-07T07:46:33.8870099Z * [new tag] ciflow/inductor/162102 -> ciflow/inductor/162102 2025-09-07T07:46:33.8870282Z * [new tag] ciflow/inductor/162104 -> ciflow/inductor/162104 2025-09-07T07:46:33.8870473Z * [new tag] ciflow/inductor/162106 -> ciflow/inductor/162106 2025-09-07T07:46:33.8870662Z * [new tag] ciflow/inductor/162108 -> ciflow/inductor/162108 2025-09-07T07:46:33.8870857Z * [new tag] ciflow/inductor/162126 -> ciflow/inductor/162126 2025-09-07T07:46:33.8871039Z * [new tag] ciflow/inductor/162149 -> ciflow/inductor/162149 2025-09-07T07:46:33.8871222Z * [new tag] ciflow/inductor/162164 -> ciflow/inductor/162164 2025-09-07T07:46:33.8871417Z * [new tag] ciflow/inductor/162166 -> ciflow/inductor/162166 2025-09-07T07:46:33.8871604Z * [new tag] ciflow/inductor/162169 -> ciflow/inductor/162169 2025-09-07T07:46:33.8871798Z * [new tag] ciflow/inductor/162170 -> ciflow/inductor/162170 2025-09-07T07:46:33.8871980Z * [new tag] ciflow/inductor/162171 -> ciflow/inductor/162171 2025-09-07T07:46:33.8872161Z * [new tag] ciflow/inductor/162183 -> ciflow/inductor/162183 2025-09-07T07:46:33.8872434Z * [new tag] ciflow/inductor/162189 -> ciflow/inductor/162189 2025-09-07T07:46:33.8872622Z * [new tag] ciflow/inductor/162190 -> ciflow/inductor/162190 2025-09-07T07:46:33.8872815Z * [new tag] ciflow/inductor/162191 -> ciflow/inductor/162191 2025-09-07T07:46:33.8873001Z * [new tag] ciflow/inductor/162194 -> ciflow/inductor/162194 2025-09-07T07:46:33.8873195Z * [new tag] 
ciflow/inductor/162200 -> ciflow/inductor/162200 2025-09-07T07:46:33.8873377Z * [new tag] ciflow/inductor/162201 -> ciflow/inductor/162201 2025-09-07T07:46:33.8873559Z * [new tag] ciflow/inductor/162208 -> ciflow/inductor/162208 2025-09-07T07:46:33.8873757Z * [new tag] ciflow/inductor/162211 -> ciflow/inductor/162211 2025-09-07T07:46:33.8873939Z * [new tag] ciflow/inductor/162216 -> ciflow/inductor/162216 2025-09-07T07:46:33.8874136Z * [new tag] ciflow/inductor/162220 -> ciflow/inductor/162220 2025-09-07T07:46:33.8874322Z * [new tag] ciflow/inductor/162222 -> ciflow/inductor/162222 2025-09-07T07:46:33.8874507Z * [new tag] ciflow/inductor/162227 -> ciflow/inductor/162227 2025-09-07T07:46:33.8874701Z * [new tag] ciflow/inductor/162238 -> ciflow/inductor/162238 2025-09-07T07:46:33.8874885Z * [new tag] ciflow/inductor/162239 -> ciflow/inductor/162239 2025-09-07T07:46:33.8875075Z * [new tag] ciflow/inductor/162240 -> ciflow/inductor/162240 2025-09-07T07:46:33.8875259Z * [new tag] ciflow/inductor/162244 -> ciflow/inductor/162244 2025-09-07T07:46:33.8875450Z * [new tag] ciflow/inductor/162245 -> ciflow/inductor/162245 2025-09-07T07:46:33.8875629Z * [new tag] ciflow/inductor/162262 -> ciflow/inductor/162262 2025-09-07T07:46:33.8875814Z * [new tag] ciflow/inductor/162275 -> ciflow/inductor/162275 2025-09-07T07:46:33.8876006Z * [new tag] ciflow/inductor/162278 -> ciflow/inductor/162278 2025-09-07T07:46:33.8876269Z * [new tag] ciflow/inductor/162284 -> ciflow/inductor/162284 2025-09-07T07:46:33.8876465Z * [new tag] ciflow/inductor/162286 -> ciflow/inductor/162286 2025-09-07T07:46:33.8876647Z * [new tag] ciflow/inductor/162288 -> ciflow/inductor/162288 2025-09-07T07:46:33.8876828Z * [new tag] ciflow/inductor/162293 -> ciflow/inductor/162293 2025-09-07T07:46:33.8877019Z * [new tag] ciflow/inductor/162294 -> ciflow/inductor/162294 2025-09-07T07:46:33.8877201Z * [new tag] ciflow/inductor/162295 -> ciflow/inductor/162295 2025-09-07T07:46:33.8877394Z * [new tag] 
ciflow/inductor/162296 -> ciflow/inductor/162296 2025-09-07T07:46:33.8877583Z * [new tag] ciflow/inductor/162298 -> ciflow/inductor/162298 2025-09-07T07:46:33.8877781Z * [new tag] ciflow/inductor/162307 -> ciflow/inductor/162307 2025-09-07T07:46:33.8877965Z * [new tag] ciflow/inductor/162309 -> ciflow/inductor/162309 2025-09-07T07:46:33.8878150Z * [new tag] ciflow/inductor/162311 -> ciflow/inductor/162311 2025-09-07T07:46:33.8878341Z * [new tag] ciflow/inductor/162312 -> ciflow/inductor/162312 2025-09-07T07:46:33.8878522Z * [new tag] ciflow/inductor/162315 -> ciflow/inductor/162315 2025-09-07T07:46:33.8878715Z * [new tag] ciflow/inductor/162316 -> ciflow/inductor/162316 2025-09-07T07:46:33.8878901Z * [new tag] ciflow/inductor/162318 -> ciflow/inductor/162318 2025-09-07T07:46:33.8879081Z * [new tag] ciflow/inductor/162323 -> ciflow/inductor/162323 2025-09-07T07:46:33.8879356Z * [new tag] ciflow/inductor/162341 -> ciflow/inductor/162341 2025-09-07T07:46:33.8879537Z * [new tag] ciflow/inductor/162345 -> ciflow/inductor/162345 2025-09-07T07:46:33.8879747Z * [new tag] ciflow/inductor/3b9a386 -> ciflow/inductor/3b9a386 2025-09-07T07:46:33.8879941Z * [new tag] ciflow/inductor/3d4b92b -> ciflow/inductor/3d4b92b 2025-09-07T07:46:33.8880138Z * [new tag] ciflow/inductor/d224ac7 -> ciflow/inductor/d224ac7 2025-09-07T07:46:33.8880348Z * [new tag] ciflow/linux-aarch64/157994 -> ciflow/linux-aarch64/157994 2025-09-07T07:46:33.8880556Z * [new tag] ciflow/linux-aarch64/159737 -> ciflow/linux-aarch64/159737 2025-09-07T07:46:33.8880771Z * [new tag] ciflow/linux-aarch64/160078 -> ciflow/linux-aarch64/160078 2025-09-07T07:46:33.8880941Z * [new tag] ciflow/mps/157553 -> ciflow/mps/157553 2025-09-07T07:46:33.8881124Z * [new tag] ciflow/mps/157635 -> ciflow/mps/157635 2025-09-07T07:46:33.8881291Z * [new tag] ciflow/mps/161988 -> ciflow/mps/161988 2025-09-07T07:46:33.8881458Z * [new tag] ciflow/mps/162108 -> ciflow/mps/162108 2025-09-07T07:46:33.8881633Z * [new tag] ciflow/mps/162153 -> 
ciflow/mps/162153 2025-09-07T07:46:33.8881796Z * [new tag] ciflow/mps/162281 -> ciflow/mps/162281 2025-09-07T07:46:33.8881990Z * [new tag] ciflow/nightly/156049 -> ciflow/nightly/156049 2025-09-07T07:46:33.8882171Z * [new tag] ciflow/nightly/158104 -> ciflow/nightly/158104 2025-09-07T07:46:33.8882390Z * [new tag] ciflow/op-benchmark/157994 -> ciflow/op-benchmark/157994 2025-09-07T07:46:33.8882666Z * [new tag] ciflow/periodic-rocm-mi300/161529 -> ciflow/periodic-rocm-mi300/161529 2025-09-07T07:46:33.8883073Z * [new tag] ciflow/periodic-rocm-mi300/161715 -> ciflow/periodic-rocm-mi300/161715 2025-09-07T07:46:33.8883293Z * [new tag] ciflow/periodic/054a2fd -> ciflow/periodic/054a2fd 2025-09-07T07:46:33.8883587Z * [new tag] ciflow/periodic/156703 -> ciflow/periodic/156703 2025-09-07T07:46:33.8883789Z * [new tag] ciflow/periodic/161715 -> ciflow/periodic/161715 2025-09-07T07:46:33.8883979Z * [new tag] ciflow/periodic/162021 -> ciflow/periodic/162021 2025-09-07T07:46:33.8884164Z * [new tag] ciflow/periodic/162323 -> ciflow/periodic/162323 2025-09-07T07:46:33.8884369Z * [new tag] ciflow/periodic/2a6d37d -> ciflow/periodic/2a6d37d 2025-09-07T07:46:33.8884561Z * [new tag] ciflow/periodic/317eeb8 -> ciflow/periodic/317eeb8 2025-09-07T07:46:33.8884751Z * [new tag] ciflow/periodic/3c32 -> ciflow/periodic/3c32 2025-09-07T07:46:33.8884942Z * [new tag] ciflow/periodic/3e98831 -> ciflow/periodic/3e98831 2025-09-07T07:46:33.8885169Z * [new tag] ciflow/periodic/94512-point -> ciflow/periodic/94512-point 2025-09-07T07:46:33.8885411Z * [new tag] ciflow/periodic/csl/test87519 -> ciflow/periodic/csl/test87519 2025-09-07T07:46:33.8885638Z * [new tag] ciflow/periodic/csltest88275 -> ciflow/periodic/csltest88275 2025-09-07T07:46:33.8885879Z * [new tag] ciflow/periodic/csltest88761 -> ciflow/periodic/csltest88761 2025-09-07T07:46:33.8886096Z * [new tag] ciflow/periodic/release_1.12 -> ciflow/periodic/release_1.12 2025-09-07T07:46:33.8886349Z * [new tag] ciflow/periodic/release_1.12.0 -> 
ciflow/periodic/release_1.12.0 2025-09-07T07:46:33.8886560Z * [new tag] ciflow/periodic/sha-ec5b83 -> ciflow/periodic/sha-ec5b83 2025-09-07T07:46:33.8886761Z * [new tag] ciflow/rocm-mi300/154170 -> ciflow/rocm-mi300/154170 2025-09-07T07:46:33.8887039Z * [new tag] ciflow/rocm-mi300/158747 -> ciflow/rocm-mi300/158747 2025-09-07T07:46:33.8887223Z * [new tag] ciflow/rocm-mi300/159146 -> ciflow/rocm-mi300/159146 2025-09-07T07:46:33.8887420Z * [new tag] ciflow/rocm-mi300/159158 -> ciflow/rocm-mi300/159158 2025-09-07T07:46:33.8887609Z * [new tag] ciflow/rocm-mi300/161715 -> ciflow/rocm-mi300/161715 2025-09-07T07:46:33.8887804Z * [new tag] ciflow/rocm-mi300/161957 -> ciflow/rocm-mi300/161957 2025-09-07T07:46:33.8887989Z * [new tag] ciflow/rocm-mi300/162053 -> ciflow/rocm-mi300/162053 2025-09-07T07:46:33.8888175Z * [new tag] ciflow/rocm-mi300/162056 -> ciflow/rocm-mi300/162056 2025-09-07T07:46:33.8888370Z * [new tag] ciflow/rocm-mi300/162112 -> ciflow/rocm-mi300/162112 2025-09-07T07:46:33.8888554Z * [new tag] ciflow/rocm-mi300/162245 -> ciflow/rocm-mi300/162245 2025-09-07T07:46:33.8888756Z * [new tag] ciflow/rocm-mi300/162278 -> ciflow/rocm-mi300/162278 2025-09-07T07:46:33.8888943Z * [new tag] ciflow/rocm-mi300/162288 -> ciflow/rocm-mi300/162288 2025-09-07T07:46:33.8889139Z * [new tag] ciflow/rocm-mi355/162053 -> ciflow/rocm-mi355/162053 2025-09-07T07:46:33.8889324Z * [new tag] ciflow/rocm-mi355/162056 -> ciflow/rocm-mi355/162056 2025-09-07T07:46:33.8889492Z * [new tag] ciflow/rocm/148492 -> ciflow/rocm/148492 2025-09-07T07:46:33.8889671Z * [new tag] ciflow/rocm/154170 -> ciflow/rocm/154170 2025-09-07T07:46:33.8889836Z * [new tag] ciflow/rocm/156491 -> ciflow/rocm/156491 2025-09-07T07:46:33.8890009Z * [new tag] ciflow/rocm/156592 -> ciflow/rocm/156592 2025-09-07T07:46:33.8890173Z * [new tag] ciflow/rocm/158747 -> ciflow/rocm/158747 2025-09-07T07:46:33.8890336Z * [new tag] ciflow/rocm/159146 -> ciflow/rocm/159146 2025-09-07T07:46:33.8890513Z * [new tag] ciflow/rocm/159158 -> 
ciflow/rocm/159158 2025-09-07T07:46:33.8890753Z * [new tag] ciflow/rocm/161715 -> ciflow/rocm/161715 2025-09-07T07:46:33.8890929Z * [new tag] ciflow/rocm/161972 -> ciflow/rocm/161972 2025-09-07T07:46:33.8891092Z * [new tag] ciflow/rocm/162052 -> ciflow/rocm/162052 2025-09-07T07:46:33.8891266Z * [new tag] ciflow/rocm/162053 -> ciflow/rocm/162053 2025-09-07T07:46:33.8891428Z * [new tag] ciflow/rocm/162056 -> ciflow/rocm/162056 2025-09-07T07:46:33.8891589Z * [new tag] ciflow/rocm/162112 -> ciflow/rocm/162112 2025-09-07T07:46:33.8891766Z * [new tag] ciflow/rocm/162278 -> ciflow/rocm/162278 2025-09-07T07:46:33.8891930Z * [new tag] ciflow/rocm/162288 -> ciflow/rocm/162288 2025-09-07T07:46:33.8892104Z * [new tag] ciflow/rocm/162305 -> ciflow/rocm/162305 2025-09-07T07:46:33.8892275Z * [new tag] ciflow/slow/01c7106 -> ciflow/slow/01c7106 2025-09-07T07:46:33.8892443Z * [new tag] ciflow/slow/0577043 -> ciflow/slow/0577043 2025-09-07T07:46:33.8892988Z * [new tag] ciflow/slow/0d5b74da0cab798fbfdb9caa53fad816999c8386-sdym -> ciflow/slow/0d5b74da0cab798fbfdb9caa53fad816999c8386-sdym 2025-09-07T07:46:33.8893156Z * [new tag] ciflow/slow/0e81104 -> ciflow/slow/0e81104 2025-09-07T07:46:33.8893329Z * [new tag] ciflow/slow/161395 -> ciflow/slow/161395 2025-09-07T07:46:33.8893495Z * [new tag] ciflow/slow/1732077 -> ciflow/slow/1732077 2025-09-07T07:46:33.8893673Z * [new tag] ciflow/slow/187eb7c -> ciflow/slow/187eb7c 2025-09-07T07:46:33.8893840Z * [new tag] ciflow/slow/1faef89 -> ciflow/slow/1faef89 2025-09-07T07:46:33.8894107Z * [new tag] ciflow/slow/3920ec1 -> ciflow/slow/3920ec1 2025-09-07T07:46:33.8894287Z * [new tag] ciflow/slow/3b7c6b2 -> ciflow/slow/3b7c6b2 2025-09-07T07:46:33.8894453Z * [new tag] ciflow/slow/59a3759 -> ciflow/slow/59a3759 2025-09-07T07:46:33.8894633Z * [new tag] ciflow/slow/70ef0bb -> ciflow/slow/70ef0bb 2025-09-07T07:46:33.8894799Z * [new tag] ciflow/slow/788ff06 -> ciflow/slow/788ff06 2025-09-07T07:46:33.8895287Z * [new tag] 
ciflow/slow/8751002215790a3a88750faa8f4366933e296693-sdym -> ciflow/slow/8751002215790a3a88750faa8f4366933e296693-sdym 2025-09-07T07:46:33.8895466Z * [new tag] ciflow/slow/9d85864 -> ciflow/slow/9d85864 2025-09-07T07:46:33.8895635Z * [new tag] ciflow/slow/9ffad5b -> ciflow/slow/9ffad5b 2025-09-07T07:46:33.8895814Z * [new tag] ciflow/slow/a206e8b -> ciflow/slow/a206e8b 2025-09-07T07:46:33.8895986Z * [new tag] ciflow/slow/a837609 -> ciflow/slow/a837609 2025-09-07T07:46:33.8896164Z * [new tag] ciflow/slow/af841f3 -> ciflow/slow/af841f3 2025-09-07T07:46:33.8896682Z * [new tag] ciflow/slow/da3aba1e46157c4df504b067477cdf2b3c96b194-sdym -> ciflow/slow/da3aba1e46157c4df504b067477cdf2b3c96b194-sdym 2025-09-07T07:46:33.8896908Z * [new tag] ciflow/triton_binaries/162329 -> ciflow/triton_binaries/162329 2025-09-07T07:46:33.8897085Z * [new tag] ciflow/trunk/113258 -> ciflow/trunk/113258 2025-09-07T07:46:33.8897249Z * [new tag] ciflow/trunk/137400 -> ciflow/trunk/137400 2025-09-07T07:46:33.8897510Z * [new tag] ciflow/trunk/148180 -> ciflow/trunk/148180 2025-09-07T07:46:33.8897680Z * [new tag] ciflow/trunk/148328 -> ciflow/trunk/148328 2025-09-07T07:46:33.8897862Z * [new tag] ciflow/trunk/148492 -> ciflow/trunk/148492 2025-09-07T07:46:33.8898030Z * [new tag] ciflow/trunk/148919 -> ciflow/trunk/148919 2025-09-07T07:46:33.8898291Z * [new tag] ciflow/trunk/152624 -> ciflow/trunk/152624 2025-09-07T07:46:33.8898470Z * [new tag] ciflow/trunk/154170 -> ciflow/trunk/154170 2025-09-07T07:46:33.8898636Z * [new tag] ciflow/trunk/154694 -> ciflow/trunk/154694 2025-09-07T07:46:33.8898810Z * [new tag] ciflow/trunk/156049 -> ciflow/trunk/156049 2025-09-07T07:46:33.8898976Z * [new tag] ciflow/trunk/156703 -> ciflow/trunk/156703 2025-09-07T07:46:33.8899142Z * [new tag] ciflow/trunk/156711 -> ciflow/trunk/156711 2025-09-07T07:46:33.8899318Z * [new tag] ciflow/trunk/157432 -> ciflow/trunk/157432 2025-09-07T07:46:33.8899483Z * [new tag] ciflow/trunk/157685 -> ciflow/trunk/157685 
2025-09-07T07:46:33.8899668Z * [new tag] ciflow/trunk/157689 -> ciflow/trunk/157689 2025-09-07T07:46:33.8899838Z * [new tag] ciflow/trunk/157699 -> ciflow/trunk/157699 2025-09-07T07:46:33.8900015Z * [new tag] ciflow/trunk/157813 -> ciflow/trunk/157813 2025-09-07T07:46:33.8900182Z * [new tag] ciflow/trunk/157994 -> ciflow/trunk/157994 2025-09-07T07:46:33.8900348Z * [new tag] ciflow/trunk/158091 -> ciflow/trunk/158091 2025-09-07T07:46:33.8900523Z * [new tag] ciflow/trunk/158104 -> ciflow/trunk/158104 2025-09-07T07:46:33.8900689Z * [new tag] ciflow/trunk/158404 -> ciflow/trunk/158404 2025-09-07T07:46:33.8900866Z * [new tag] ciflow/trunk/158647 -> ciflow/trunk/158647 2025-09-07T07:46:33.8901032Z * [new tag] ciflow/trunk/158846 -> ciflow/trunk/158846 2025-09-07T07:46:33.8901281Z * [new tag] ciflow/trunk/159158 -> ciflow/trunk/159158 2025-09-07T07:46:33.8901456Z * [new tag] ciflow/trunk/159682 -> ciflow/trunk/159682 2025-09-07T07:46:33.8901625Z * [new tag] ciflow/trunk/159835 -> ciflow/trunk/159835 2025-09-07T07:46:33.8901800Z * [new tag] ciflow/trunk/160161 -> ciflow/trunk/160161 2025-09-07T07:46:33.8901967Z * [new tag] ciflow/trunk/160236 -> ciflow/trunk/160236 2025-09-07T07:46:33.8902134Z * [new tag] ciflow/trunk/160329 -> ciflow/trunk/160329 2025-09-07T07:46:33.8902314Z * [new tag] ciflow/trunk/160480 -> ciflow/trunk/160480 2025-09-07T07:46:33.8902479Z * [new tag] ciflow/trunk/160532 -> ciflow/trunk/160532 2025-09-07T07:46:33.8902655Z * [new tag] ciflow/trunk/160836 -> ciflow/trunk/160836 2025-09-07T07:46:33.8902825Z * [new tag] ciflow/trunk/160843 -> ciflow/trunk/160843 2025-09-07T07:46:33.8903000Z * [new tag] ciflow/trunk/160869 -> ciflow/trunk/160869 2025-09-07T07:46:33.8903171Z * [new tag] ciflow/trunk/160928 -> ciflow/trunk/160928 2025-09-07T07:46:33.8903338Z * [new tag] ciflow/trunk/160940 -> ciflow/trunk/160940 2025-09-07T07:46:33.8903518Z * [new tag] ciflow/trunk/160943 -> ciflow/trunk/160943 2025-09-07T07:46:33.8903682Z * [new tag] ciflow/trunk/160953 -> 
ciflow/trunk/160953 2025-09-07T07:46:33.8903862Z * [new tag] ciflow/trunk/161035 -> ciflow/trunk/161035 2025-09-07T07:46:33.8904026Z * [new tag] ciflow/trunk/161178 -> ciflow/trunk/161178 2025-09-07T07:46:33.8904191Z * [new tag] ciflow/trunk/161349 -> ciflow/trunk/161349 2025-09-07T07:46:33.8904368Z * [new tag] ciflow/trunk/161350 -> ciflow/trunk/161350 2025-09-07T07:46:33.8904535Z * [new tag] ciflow/trunk/161351 -> ciflow/trunk/161351 2025-09-07T07:46:33.8904796Z * [new tag] ciflow/trunk/161395 -> ciflow/trunk/161395 2025-09-07T07:46:33.8904963Z * [new tag] ciflow/trunk/161405 -> ciflow/trunk/161405 2025-09-07T07:46:33.8905140Z * [new tag] ciflow/trunk/161406 -> ciflow/trunk/161406 2025-09-07T07:46:33.8905305Z * [new tag] ciflow/trunk/161410 -> ciflow/trunk/161410 2025-09-07T07:46:33.8905473Z * [new tag] ciflow/trunk/161468 -> ciflow/trunk/161468 2025-09-07T07:46:33.8905649Z * [new tag] ciflow/trunk/161499 -> ciflow/trunk/161499 2025-09-07T07:46:33.8905814Z * [new tag] ciflow/trunk/161527 -> ciflow/trunk/161527 2025-09-07T07:46:33.8905990Z * [new tag] ciflow/trunk/161534 -> ciflow/trunk/161534 2025-09-07T07:46:33.8906161Z * [new tag] ciflow/trunk/161591 -> ciflow/trunk/161591 2025-09-07T07:46:33.8906325Z * [new tag] ciflow/trunk/161595 -> ciflow/trunk/161595 2025-09-07T07:46:33.8906508Z * [new tag] ciflow/trunk/161596 -> ciflow/trunk/161596 2025-09-07T07:46:33.8906677Z * [new tag] ciflow/trunk/161633 -> ciflow/trunk/161633 2025-09-07T07:46:33.8906857Z * [new tag] ciflow/trunk/161634 -> ciflow/trunk/161634 2025-09-07T07:46:33.8907021Z * [new tag] ciflow/trunk/161635 -> ciflow/trunk/161635 2025-09-07T07:46:33.8907196Z * [new tag] ciflow/trunk/161667 -> ciflow/trunk/161667 2025-09-07T07:46:33.8907360Z * [new tag] ciflow/trunk/161670 -> ciflow/trunk/161670 2025-09-07T07:46:33.8907524Z * [new tag] ciflow/trunk/161692 -> ciflow/trunk/161692 2025-09-07T07:46:33.8907698Z * [new tag] ciflow/trunk/161693 -> ciflow/trunk/161693 2025-09-07T07:46:33.8907954Z * [new tag] 
ciflow/trunk/161695 -> ciflow/trunk/161695 2025-09-07T07:46:33.8908132Z * [new tag] ciflow/trunk/161730 -> ciflow/trunk/161730 2025-09-07T07:46:33.8908298Z * [new tag] ciflow/trunk/161744 -> ciflow/trunk/161744 2025-09-07T07:46:33.8908463Z * [new tag] ciflow/trunk/161749 -> ciflow/trunk/161749 2025-09-07T07:46:33.8908643Z * [new tag] ciflow/trunk/161881 -> ciflow/trunk/161881 2025-09-07T07:46:33.8908809Z * [new tag] ciflow/trunk/161924 -> ciflow/trunk/161924 2025-09-07T07:46:33.8908986Z * [new tag] ciflow/trunk/161926 -> ciflow/trunk/161926 2025-09-07T07:46:33.8909149Z * [new tag] ciflow/trunk/161936 -> ciflow/trunk/161936 2025-09-07T07:46:33.8909323Z * [new tag] ciflow/trunk/161952 -> ciflow/trunk/161952 2025-09-07T07:46:33.8909491Z * [new tag] ciflow/trunk/161955 -> ciflow/trunk/161955 2025-09-07T07:46:33.8909656Z * [new tag] ciflow/trunk/161957 -> ciflow/trunk/161957 2025-09-07T07:46:33.8909838Z * [new tag] ciflow/trunk/161959 -> ciflow/trunk/161959 2025-09-07T07:46:33.8910003Z * [new tag] ciflow/trunk/161977 -> ciflow/trunk/161977 2025-09-07T07:46:33.8910176Z * [new tag] ciflow/trunk/161988 -> ciflow/trunk/161988 2025-09-07T07:46:33.8910339Z * [new tag] ciflow/trunk/161994 -> ciflow/trunk/161994 2025-09-07T07:46:33.8910505Z * [new tag] ciflow/trunk/162007 -> ciflow/trunk/162007 2025-09-07T07:46:33.8910680Z * [new tag] ciflow/trunk/162013 -> ciflow/trunk/162013 2025-09-07T07:46:33.8910845Z * [new tag] ciflow/trunk/162017 -> ciflow/trunk/162017 2025-09-07T07:46:33.8911028Z * [new tag] ciflow/trunk/162021 -> ciflow/trunk/162021 2025-09-07T07:46:33.8911191Z * [new tag] ciflow/trunk/162022 -> ciflow/trunk/162022 2025-09-07T07:46:33.8911450Z * [new tag] ciflow/trunk/162040 -> ciflow/trunk/162040 2025-09-07T07:46:33.8911618Z * [new tag] ciflow/trunk/162041 -> ciflow/trunk/162041 2025-09-07T07:46:33.8911782Z * [new tag] ciflow/trunk/162062 -> ciflow/trunk/162062 2025-09-07T07:46:33.8911957Z * [new tag] ciflow/trunk/162066 -> ciflow/trunk/162066 
2025-09-07T07:46:33.8912122Z * [new tag] ciflow/trunk/162089 -> ciflow/trunk/162089 2025-09-07T07:46:33.8912297Z * [new tag] ciflow/trunk/162099 -> ciflow/trunk/162099 2025-09-07T07:46:33.8912460Z * [new tag] ciflow/trunk/162104 -> ciflow/trunk/162104 2025-09-07T07:46:33.8912624Z * [new tag] ciflow/trunk/162106 -> ciflow/trunk/162106 2025-09-07T07:46:33.8912803Z * [new tag] ciflow/trunk/162112 -> ciflow/trunk/162112 2025-09-07T07:46:33.8912973Z * [new tag] ciflow/trunk/162119 -> ciflow/trunk/162119 2025-09-07T07:46:33.8913149Z * [new tag] ciflow/trunk/162142 -> ciflow/trunk/162142 2025-09-07T07:46:33.8913312Z * [new tag] ciflow/trunk/162169 -> ciflow/trunk/162169 2025-09-07T07:46:33.8913488Z * [new tag] ciflow/trunk/162183 -> ciflow/trunk/162183 2025-09-07T07:46:33.8913655Z * [new tag] ciflow/trunk/162190 -> ciflow/trunk/162190 2025-09-07T07:46:33.8913821Z * [new tag] ciflow/trunk/162194 -> ciflow/trunk/162194 2025-09-07T07:46:33.8913999Z * [new tag] ciflow/trunk/162200 -> ciflow/trunk/162200 2025-09-07T07:46:33.8914164Z * [new tag] ciflow/trunk/162206 -> ciflow/trunk/162206 2025-09-07T07:46:33.8914434Z * [new tag] ciflow/trunk/162208 -> ciflow/trunk/162208 2025-09-07T07:46:33.8914599Z * [new tag] ciflow/trunk/162222 -> ciflow/trunk/162222 2025-09-07T07:46:33.8914768Z * [new tag] ciflow/trunk/162238 -> ciflow/trunk/162238 2025-09-07T07:46:33.8914945Z * [new tag] ciflow/trunk/162244 -> ciflow/trunk/162244 2025-09-07T07:46:33.8915113Z * [new tag] ciflow/trunk/162267 -> ciflow/trunk/162267 2025-09-07T07:46:33.8915289Z * [new tag] ciflow/trunk/162269 -> ciflow/trunk/162269 2025-09-07T07:46:33.8915455Z * [new tag] ciflow/trunk/162278 -> ciflow/trunk/162278 2025-09-07T07:46:33.8915621Z * [new tag] ciflow/trunk/162286 -> ciflow/trunk/162286 2025-09-07T07:46:33.8915796Z * [new tag] ciflow/trunk/162288 -> ciflow/trunk/162288 2025-09-07T07:46:33.8915968Z * [new tag] ciflow/trunk/162293 -> ciflow/trunk/162293 2025-09-07T07:46:33.8916151Z * [new tag] ciflow/trunk/162310 -> 
ciflow/trunk/162310 2025-09-07T07:46:33.8916321Z * [new tag] ciflow/trunk/162311 -> ciflow/trunk/162311 2025-09-07T07:46:33.8916504Z * [new tag] ciflow/trunk/162315 -> ciflow/trunk/162315 2025-09-07T07:46:33.8916668Z * [new tag] ciflow/trunk/162325 -> ciflow/trunk/162325 2025-09-07T07:46:33.8916835Z * [new tag] ciflow/trunk/162328 -> ciflow/trunk/162328 2025-09-07T07:46:33.8917008Z * [new tag] ciflow/trunk/162329 -> ciflow/trunk/162329 2025-09-07T07:46:33.8917179Z * [new tag] ciflow/unstable/123 -> ciflow/unstable/123 2025-09-07T07:46:33.8917350Z * [new tag] ciflow/vllm/162292 -> ciflow/vllm/162292 2025-09-07T07:46:33.8917536Z * [new tag] ciflow/win-arm64/156049 -> ciflow/win-arm64/156049 2025-09-07T07:46:33.8917724Z * [new tag] ciflow/win-arm64/158104 -> ciflow/win-arm64/158104 2025-09-07T07:46:33.8917992Z * [new tag] ciflow/xpu/157699 -> ciflow/xpu/157699 2025-09-07T07:46:33.8918158Z * [new tag] ciflow/xpu/157994 -> ciflow/xpu/157994 2025-09-07T07:46:33.8918331Z * [new tag] ciflow/xpu/159459 -> ciflow/xpu/159459 2025-09-07T07:46:33.8918491Z * [new tag] ciflow/xpu/159718 -> ciflow/xpu/159718 2025-09-07T07:46:33.8918661Z * [new tag] ciflow/xpu/159944 -> ciflow/xpu/159944 2025-09-07T07:46:33.8918820Z * [new tag] ciflow/xpu/160867 -> ciflow/xpu/160867 2025-09-07T07:46:33.8918978Z * [new tag] ciflow/xpu/160938 -> ciflow/xpu/160938 2025-09-07T07:46:33.8919147Z * [new tag] ciflow/xpu/160940 -> ciflow/xpu/160940 2025-09-07T07:46:33.8919311Z * [new tag] ciflow/xpu/160953 -> ciflow/xpu/160953 2025-09-07T07:46:33.8919483Z * [new tag] ciflow/xpu/161045 -> ciflow/xpu/161045 2025-09-07T07:46:33.8919644Z * [new tag] ciflow/xpu/161058 -> ciflow/xpu/161058 2025-09-07T07:46:33.8919816Z * [new tag] ciflow/xpu/161246 -> ciflow/xpu/161246 2025-09-07T07:46:33.8919975Z * [new tag] ciflow/xpu/161397 -> ciflow/xpu/161397 2025-09-07T07:46:33.8920136Z * [new tag] ciflow/xpu/161485 -> ciflow/xpu/161485 2025-09-07T07:46:33.8920307Z * [new tag] ciflow/xpu/161988 -> ciflow/xpu/161988 
2025-09-07T07:46:33.8920467Z * [new tag] ciflow/xpu/162062 -> ciflow/xpu/162062 2025-09-07T07:46:33.8920625Z * [new tag] cslpull75 -> cslpull75 2025-09-07T07:46:33.8920766Z * [new tag] cslpull76 -> cslpull76 2025-09-07T07:46:33.8920990Z * [new tag] cslpull77 -> cslpull77 2025-09-07T07:46:33.8921137Z * [new tag] cslpull78 -> cslpull78 2025-09-07T07:46:33.8921280Z * [new tag] cslpull79 -> cslpull79 2025-09-07T07:46:33.8921431Z * [new tag] cslpull80 -> cslpull80 2025-09-07T07:46:33.8921569Z * [new tag] cslpull81 -> cslpull81 2025-09-07T07:46:33.8921723Z * [new tag] cslpull82 -> cslpull82 2025-09-07T07:46:33.8921861Z * [new tag] cslpull83 -> cslpull83 2025-09-07T07:46:33.8922000Z * [new tag] cslpull84 -> cslpull84 2025-09-07T07:46:33.8922151Z * [new tag] cslpull85 -> cslpull85 2025-09-07T07:46:33.8922289Z * [new tag] cslpull86 -> cslpull86 2025-09-07T07:46:33.8922445Z * [new tag] cslpull87 -> cslpull87 2025-09-07T07:46:33.8922582Z * [new tag] cslpull88 -> cslpull88 2025-09-07T07:46:33.8922728Z * [new tag] cslpull89 -> cslpull89 2025-09-07T07:46:33.8923015Z * [new tag] cslpull90 -> cslpull90 2025-09-07T07:46:33.8923158Z * [new tag] cslpull91 -> cslpull91 2025-09-07T07:46:33.8923314Z * [new tag] cslpull92 -> cslpull92 2025-09-07T07:46:33.8923455Z * [new tag] flight_5 -> flight_5 2025-09-07T07:46:33.8923602Z * [new tag] flight_5.1 -> flight_5.1 2025-09-07T07:46:33.8923755Z * [new tag] flight_5.2 -> flight_5.2 2025-09-07T07:46:33.8923899Z * [new tag] flight_5.3 -> flight_5.3 2025-09-07T07:46:33.8924054Z * [new tag] forpull1 -> forpull1 2025-09-07T07:46:33.8924229Z * [new tag] malfet/tag-2ef5611 -> malfet/tag-2ef5611 2025-09-07T07:46:33.8924511Z * [new tag] malfet/tag-317b1a0 -> malfet/tag-317b1a0 2025-09-07T07:46:33.8924680Z * [new tag] malfet/tag-ec6f767 -> malfet/tag-ec6f767 2025-09-07T07:46:33.8924845Z * [new tag] nightly-binary -> nightly-binary 2025-09-07T07:46:33.8925031Z * [new tag] sqzhang_flight4_plus -> sqzhang_flight4_plus 2025-09-07T07:46:33.8925195Z * [new 
tag] sqzhang_flight_3 -> sqzhang_flight_3 2025-09-07T07:46:33.8925588Z * [new tag] trunk/00636e0171e7e733628c408084805442270cf608 -> trunk/00636e0171e7e733628c408084805442270cf608 2025-09-07T07:46:33.8925972Z * [new tag] trunk/019fed39aa6b2dd8c69347378d53423e5efae8d4 -> trunk/019fed39aa6b2dd8c69347378d53423e5efae8d4 2025-09-07T07:46:33.8926362Z * [new tag] trunk/01ab325cc2e0dc221af4d710974e1b9175066544 -> trunk/01ab325cc2e0dc221af4d710974e1b9175066544 2025-09-07T07:46:33.8926779Z * [new tag] trunk/01edcd4df8bf0c7b4cc2d3ec868bd2059eeea83b -> trunk/01edcd4df8bf0c7b4cc2d3ec868bd2059eeea83b 2025-09-07T07:46:33.8927161Z * [new tag] trunk/040d00af048967dde7938d358d7f5988cbd18388 -> trunk/040d00af048967dde7938d358d7f5988cbd18388 2025-09-07T07:46:33.8927558Z * [new tag] trunk/0447f2d99b4351b2ff129dce6eebb371024f73e5 -> trunk/0447f2d99b4351b2ff129dce6eebb371024f73e5 2025-09-07T07:46:33.8927928Z * [new tag] trunk/047603d35bdc70046216384838d6340feab79bf4 -> trunk/047603d35bdc70046216384838d6340feab79bf4 2025-09-07T07:46:33.8928327Z * [new tag] trunk/06da7c0730b3764f178ec3a90dedf4ffa4202d81 -> trunk/06da7c0730b3764f178ec3a90dedf4ffa4202d81 2025-09-07T07:46:33.8928702Z * [new tag] trunk/081cab045472ce045634548cc6c14a4870641e23 -> trunk/081cab045472ce045634548cc6c14a4870641e23 2025-09-07T07:46:33.8929192Z * [new tag] trunk/09587daf8c9f21f5340f73921ce5f23d1a4a4572 -> trunk/09587daf8c9f21f5340f73921ce5f23d1a4a4572 2025-09-07T07:46:33.8929575Z * [new tag] trunk/09be1890d72cc34fc946965dc4a27736bf0ca8c6 -> trunk/09be1890d72cc34fc946965dc4a27736bf0ca8c6 2025-09-07T07:46:33.8929955Z * [new tag] trunk/09d2f1b6315d6d416fbf452793d65795863ebc66 -> trunk/09d2f1b6315d6d416fbf452793d65795863ebc66 2025-09-07T07:46:33.8930358Z * [new tag] trunk/0af70e2353e1dcda83175fd4834ecb7b63e009e0 -> trunk/0af70e2353e1dcda83175fd4834ecb7b63e009e0 2025-09-07T07:46:33.8930737Z * [new tag] trunk/0c0e056a9e20c17271a6144dd32c0c7e3ba26736 -> trunk/0c0e056a9e20c17271a6144dd32c0c7e3ba26736 
2025-09-07T07:46:33.8931140Z * [new tag] trunk/0cd6c56bdfa9178ff61be82ce3b178926ddb64a9 -> trunk/0cd6c56bdfa9178ff61be82ce3b178926ddb64a9 2025-09-07T07:46:33.8931533Z * [new tag] trunk/0d421ace32c1605ee8e452ee1eeb03bd243dd96c -> trunk/0d421ace32c1605ee8e452ee1eeb03bd243dd96c 2025-09-07T07:46:33.8931943Z * [new tag] trunk/0d71a9dd5b4b6d1dde58d91c9b71d96bc6a6a171 -> trunk/0d71a9dd5b4b6d1dde58d91c9b71d96bc6a6a171 2025-09-07T07:46:33.8932337Z * [new tag] trunk/0d84ff3b78f55492d3d4708458c92d776274939e -> trunk/0d84ff3b78f55492d3d4708458c92d776274939e 2025-09-07T07:46:33.8932727Z * [new tag] trunk/0f45aaf4414048b17d720d0915ce221a8de8ec63 -> trunk/0f45aaf4414048b17d720d0915ce221a8de8ec63 2025-09-07T07:46:33.8933127Z * [new tag] trunk/0ff8eabf1387de5acd6712a03bda61f1a3dfa27f -> trunk/0ff8eabf1387de5acd6712a03bda61f1a3dfa27f 2025-09-07T07:46:33.8933501Z * [new tag] trunk/104f2680e03d13a4765ca69f905d8f16fc0c822f -> trunk/104f2680e03d13a4765ca69f905d8f16fc0c822f 2025-09-07T07:46:33.8933890Z * [new tag] trunk/12814701555d3e41dfcdf8f9273af5821e322df0 -> trunk/12814701555d3e41dfcdf8f9273af5821e322df0 2025-09-07T07:46:33.8934275Z * [new tag] trunk/13b65196db422bdb394cb482e208c61ed448898c -> trunk/13b65196db422bdb394cb482e208c61ed448898c 2025-09-07T07:46:33.8934764Z * [new tag] trunk/13d66e2a66eceed14b8a8f5a971087df4f688a46 -> trunk/13d66e2a66eceed14b8a8f5a971087df4f688a46 2025-09-07T07:46:33.8935155Z * [new tag] trunk/145a3a7bda15e3963a33eb1b54bba5d4a270b225 -> trunk/145a3a7bda15e3963a33eb1b54bba5d4a270b225 2025-09-07T07:46:33.8935541Z * [new tag] trunk/146371483318e17929daefd37c8e459d9d6d47bb -> trunk/146371483318e17929daefd37c8e459d9d6d47bb 2025-09-07T07:46:33.8935933Z * [new tag] trunk/15c77a8cfd341e74fd124b077492ef2bfa51b339 -> trunk/15c77a8cfd341e74fd124b077492ef2bfa51b339 2025-09-07T07:46:33.8936319Z * [new tag] trunk/17fa8eec4a1e32939ab4d364ee6e75487a79b654 -> trunk/17fa8eec4a1e32939ab4d364ee6e75487a79b654 2025-09-07T07:46:33.8936708Z * [new tag] 
trunk/190c391a28845a14df26abb228d26aa813efb20c -> trunk/190c391a28845a14df26abb228d26aa813efb20c 2025-09-07T07:46:33.8937099Z * [new tag] trunk/1a588ace4667bde1331fbd8ed957157dca5cee68 -> trunk/1a588ace4667bde1331fbd8ed957157dca5cee68 2025-09-07T07:46:33.8937588Z * [new tag] trunk/1aa7476885e8f6e7b0ec3a5b6383aad9d3f343e7 -> trunk/1aa7476885e8f6e7b0ec3a5b6383aad9d3f343e7 2025-09-07T07:46:33.8937978Z * [new tag] trunk/1aeb421c342c9e9607842f4c87cb46e8e816ee53 -> trunk/1aeb421c342c9e9607842f4c87cb46e8e816ee53 2025-09-07T07:46:33.8938375Z * [new tag] trunk/1c1b28d5b6a942fafe23b2f09302d93c25226d4a -> trunk/1c1b28d5b6a942fafe23b2f09302d93c25226d4a 2025-09-07T07:46:33.8938769Z * [new tag] trunk/1ebd70d0c0d562d3be9abdee2a21906584af7d99 -> trunk/1ebd70d0c0d562d3be9abdee2a21906584af7d99 2025-09-07T07:46:33.8939171Z * [new tag] trunk/1ec2c15914da4ef7bd926ed9aebc8671c75fe965 -> trunk/1ec2c15914da4ef7bd926ed9aebc8671c75fe965 2025-09-07T07:46:33.8939553Z * [new tag] trunk/1f51056bd64e73d1aa81321bc3c098575b1bc78a -> trunk/1f51056bd64e73d1aa81321bc3c098575b1bc78a 2025-09-07T07:46:33.8940033Z * [new tag] trunk/1f820de639c75a1562d3fb03f160439f853ae07b -> trunk/1f820de639c75a1562d3fb03f160439f853ae07b 2025-09-07T07:46:33.8940419Z * [new tag] trunk/204697f0e695d82894c5010fbec664c4391f90cc -> trunk/204697f0e695d82894c5010fbec664c4391f90cc 2025-09-07T07:46:33.8940796Z * [new tag] trunk/20629b1619fe636227d01fc85ba221daa7185a05 -> trunk/20629b1619fe636227d01fc85ba221daa7185a05 2025-09-07T07:46:33.8941193Z * [new tag] trunk/20b47acef845e9c4f71da9429a396d293f50ebe7 -> trunk/20b47acef845e9c4f71da9429a396d293f50ebe7 2025-09-07T07:46:33.8941577Z * [new tag] trunk/20bfb2539d7c5250379648eda35f80b8a7d642dd -> trunk/20bfb2539d7c5250379648eda35f80b8a7d642dd 2025-09-07T07:46:33.8941968Z * [new tag] trunk/21fae99c180d17def562797ea0fb154d8fdf88e3 -> trunk/21fae99c180d17def562797ea0fb154d8fdf88e3 2025-09-07T07:46:33.8942360Z * [new tag] trunk/248355faf53f9f7ba2fd0a367d59600c6d991e7f -> 
trunk/248355faf53f9f7ba2fd0a367d59600c6d991e7f 2025-09-07T07:46:33.8942750Z * [new tag] trunk/25f4aaed9ec26f39c13862323ff8582006473d23 -> trunk/25f4aaed9ec26f39c13862323ff8582006473d23 2025-09-07T07:46:33.8943132Z * [new tag] trunk/261a84a1764412f8e659c956e3f81997ec3de9d5 -> trunk/261a84a1764412f8e659c956e3f81997ec3de9d5 2025-09-07T07:46:33.8943508Z * [new tag] trunk/28f4ab0737937858730f29f5c4e601e109cf9d5f -> trunk/28f4ab0737937858730f29f5c4e601e109cf9d5f 2025-09-07T07:46:33.8943904Z * [new tag] trunk/291cd11f2d5df6f48d348cce0e4e762f274f4dc4 -> trunk/291cd11f2d5df6f48d348cce0e4e762f274f4dc4 2025-09-07T07:46:33.8944278Z * [new tag] trunk/29280864d941e6108ab57f7298f520c0cf9696e9 -> trunk/29280864d941e6108ab57f7298f520c0cf9696e9 2025-09-07T07:46:33.8944672Z * [new tag] trunk/2a45837e98c63cae9d1a2e2133a727b829e549d5 -> trunk/2a45837e98c63cae9d1a2e2133a727b829e549d5 2025-09-07T07:46:33.8945066Z * [new tag] trunk/2a5c0785e2f975697fd7bdf1411de6e03dcaa1ef -> trunk/2a5c0785e2f975697fd7bdf1411de6e03dcaa1ef 2025-09-07T07:46:33.8945544Z * [new tag] trunk/2b8a83901c58a0858ea9e4ce00055f48e6ed164c -> trunk/2b8a83901c58a0858ea9e4ce00055f48e6ed164c 2025-09-07T07:46:33.8945923Z * [new tag] trunk/2ba65472dd54488a86a50326ea990195fc6732d6 -> trunk/2ba65472dd54488a86a50326ea990195fc6732d6 2025-09-07T07:46:33.8946314Z * [new tag] trunk/2c03f0acc53ed13fe8ebfe809129f25996e009a0 -> trunk/2c03f0acc53ed13fe8ebfe809129f25996e009a0 2025-09-07T07:46:33.8946699Z * [new tag] trunk/2dd529df0092799f68ee7afcf52338276906706a -> trunk/2dd529df0092799f68ee7afcf52338276906706a 2025-09-07T07:46:33.8947093Z * [new tag] trunk/2f6b4b1ad3f82bb3bd984f6e65744ea339ffb8b5 -> trunk/2f6b4b1ad3f82bb3bd984f6e65744ea339ffb8b5 2025-09-07T07:46:33.8947490Z * [new tag] trunk/2fa0520a64ed8aa734a56c4d124958f0b5711ca8 -> trunk/2fa0520a64ed8aa734a56c4d124958f0b5711ca8 2025-09-07T07:46:33.8947881Z * [new tag] trunk/302df2ac5dc4222294c09d48804a2dddb8f4bad8 -> trunk/302df2ac5dc4222294c09d48804a2dddb8f4bad8 
2025-09-07T07:46:33.8948273Z * [new tag] trunk/33028597bfa2e0178e28c8cce33cb9b3800cac43 -> trunk/33028597bfa2e0178e28c8cce33cb9b3800cac43 2025-09-07T07:46:33.8948647Z * [new tag] trunk/34aa78274d6770086025a967fa63a86830e08176 -> trunk/34aa78274d6770086025a967fa63a86830e08176 2025-09-07T07:46:33.8949042Z * [new tag] trunk/3559c354ce6a14d11fe29fb12fa2747a2f2af449 -> trunk/3559c354ce6a14d11fe29fb12fa2747a2f2af449 2025-09-07T07:46:33.8949430Z * [new tag] trunk/36d207fcaaede0d1e58a5168084c307b32b6fd8b -> trunk/36d207fcaaede0d1e58a5168084c307b32b6fd8b 2025-09-07T07:46:33.8949811Z * [new tag] trunk/377033757ae5ca524ea842f1b0a5f446ed3d8fe0 -> trunk/377033757ae5ca524ea842f1b0a5f446ed3d8fe0 2025-09-07T07:46:33.8950287Z * [new tag] trunk/3771380f83fcac154a7c89ad679311d8c4818287 -> trunk/3771380f83fcac154a7c89ad679311d8c4818287 2025-09-07T07:46:33.8950667Z * [new tag] trunk/3a207816cc569f78863d86c01f2a3d265350e39f -> trunk/3a207816cc569f78863d86c01f2a3d265350e39f 2025-09-07T07:46:33.8951059Z * [new tag] trunk/3a20a20e7065ec927fdd216d4da3b04f879b3c67 -> trunk/3a20a20e7065ec927fdd216d4da3b04f879b3c67 2025-09-07T07:46:33.8951452Z * [new tag] trunk/3bbc2e3e4f025523eaa5dbff220b3e96bca608d0 -> trunk/3bbc2e3e4f025523eaa5dbff220b3e96bca608d0 2025-09-07T07:46:33.8951853Z * [new tag] trunk/3c0ff1b569c45cfa6935ad8031a9d4cf1551aa3f -> trunk/3c0ff1b569c45cfa6935ad8031a9d4cf1551aa3f 2025-09-07T07:46:33.8952240Z * [new tag] trunk/3c45af079afc92a03b03ddf4f9198902ffcf30cf -> trunk/3c45af079afc92a03b03ddf4f9198902ffcf30cf 2025-09-07T07:46:33.8952633Z * [new tag] trunk/3dde5d7f9bf80dd6623a712bc429e9e4302464b5 -> trunk/3dde5d7f9bf80dd6623a712bc429e9e4302464b5 2025-09-07T07:46:33.8953035Z * [new tag] trunk/403a3a393cda7e60f503f3b04b8805a845dcf45d -> trunk/403a3a393cda7e60f503f3b04b8805a845dcf45d 2025-09-07T07:46:33.8953423Z * [new tag] trunk/420c52ecf36f86d32da0853bfbe074b682b070aa -> trunk/420c52ecf36f86d32da0853bfbe074b682b070aa 2025-09-07T07:46:33.8953819Z * [new tag] 
trunk/43b7c86a2c0f91320f5c5f4827b111edff06fdb6 -> trunk/43b7c86a2c0f91320f5c5f4827b111edff06fdb6 2025-09-07T07:46:33.8954196Z * [new tag] trunk/451ed931562ec8b46d1f7e6c266a68132a119336 -> trunk/451ed931562ec8b46d1f7e6c266a68132a119336 2025-09-07T07:46:33.8954582Z * [new tag] trunk/480c7391126656154318fabf1d57ebc01e196e63 -> trunk/480c7391126656154318fabf1d57ebc01e196e63 2025-09-07T07:46:33.8954968Z * [new tag] trunk/48bedd753da22634aa94fbafeb731e82025404f3 -> trunk/48bedd753da22634aa94fbafeb731e82025404f3 2025-09-07T07:46:33.8955364Z * [new tag] trunk/494878a11b79071ada0b98f34042d47155be6d1c -> trunk/494878a11b79071ada0b98f34042d47155be6d1c 2025-09-07T07:46:33.8955851Z * [new tag] trunk/4ae57d448c0a7d37e4cfd5c27d977fad2cef4051 -> trunk/4ae57d448c0a7d37e4cfd5c27d977fad2cef4051 2025-09-07T07:46:33.8956240Z * [new tag] trunk/4cdaf8265d86f984254b62052da8c26ef61ef1cf -> trunk/4cdaf8265d86f984254b62052da8c26ef61ef1cf 2025-09-07T07:46:33.8956647Z * [new tag] trunk/4d4abec80f03cd8fdefe1d9cb3a60d3690cd777e -> trunk/4d4abec80f03cd8fdefe1d9cb3a60d3690cd777e 2025-09-07T07:46:33.8957044Z * [new tag] trunk/4e42aa8ffc44b8340eb0eeaf80a2cafc4763a186 -> trunk/4e42aa8ffc44b8340eb0eeaf80a2cafc4763a186 2025-09-07T07:46:33.8957439Z * [new tag] trunk/4f72d932feee0749397fec876dcd43994f50b215 -> trunk/4f72d932feee0749397fec876dcd43994f50b215 2025-09-07T07:46:33.8957829Z * [new tag] trunk/50fc22dedf3c4a27be61fa05551c4f320281b42d -> trunk/50fc22dedf3c4a27be61fa05551c4f320281b42d 2025-09-07T07:46:33.8958226Z * [new tag] trunk/5211f1f908907ffc064b56e43cf8659f7fc22aa9 -> trunk/5211f1f908907ffc064b56e43cf8659f7fc22aa9 2025-09-07T07:46:33.8958613Z * [new tag] trunk/524b78d4f67045b83bb69edc56ab16efe282971c -> trunk/524b78d4f67045b83bb69edc56ab16efe282971c 2025-09-07T07:46:33.8959013Z * [new tag] trunk/54e275e0d81fe1e1ccfa4fb5f2a5a9aaca00ca15 -> trunk/54e275e0d81fe1e1ccfa4fb5f2a5a9aaca00ca15 2025-09-07T07:46:33.8959397Z * [new tag] trunk/5561e45758d59c94605873d5db48ed459c004c3b -> 
trunk/5561e45758d59c94605873d5db48ed459c004c3b 2025-09-07T07:46:33.8959775Z * [new tag] trunk/57278d45f046d4f89f45d373b1af4dd56934ff24 -> trunk/57278d45f046d4f89f45d373b1af4dd56934ff24 2025-09-07T07:46:33.8960163Z * [new tag] trunk/5927a70934ccf7b70182d364c23245a7dd685503 -> trunk/5927a70934ccf7b70182d364c23245a7dd685503 2025-09-07T07:46:33.8960636Z * [new tag] trunk/5985e28912aeb40b103ebfcf2fd0665eb4a50599 -> trunk/5985e28912aeb40b103ebfcf2fd0665eb4a50599 2025-09-07T07:46:33.8961046Z * [new tag] trunk/5a2da090ed6db88bb657c4e51ec0b310cd08bff6 -> trunk/5a2da090ed6db88bb657c4e51ec0b310cd08bff6 2025-09-07T07:46:33.8961437Z * [new tag] trunk/5c473e9f5ee0ef0fc38e6cf34a95b547f8cdc8d5 -> trunk/5c473e9f5ee0ef0fc38e6cf34a95b547f8cdc8d5 2025-09-07T07:46:33.8961825Z * [new tag] trunk/5c67426d6847667a7c55a2dd01f470fa37238c18 -> trunk/5c67426d6847667a7c55a2dd01f470fa37238c18 2025-09-07T07:46:33.8962207Z * [new tag] trunk/5da573c42c332bc68d4b7946c69f690a876d951a -> trunk/5da573c42c332bc68d4b7946c69f690a876d951a 2025-09-07T07:46:33.8962583Z * [new tag] trunk/5e5870e858f60ff4bf87d03f3592097e934a9580 -> trunk/5e5870e858f60ff4bf87d03f3592097e934a9580 2025-09-07T07:46:33.8963226Z * [new tag] trunk/5f3cbc9442aa55b5afb29f4ac8ca9be569003e84 -> trunk/5f3cbc9442aa55b5afb29f4ac8ca9be569003e84 2025-09-07T07:46:33.8963616Z * [new tag] trunk/600c25e9a17fe56e3dee872be8854db08916ba0c -> trunk/600c25e9a17fe56e3dee872be8854db08916ba0c 2025-09-07T07:46:33.8964027Z * [new tag] trunk/601ae8e4831fc8123fffcfb8fd2e6b6381b42e14 -> trunk/601ae8e4831fc8123fffcfb8fd2e6b6381b42e14 2025-09-07T07:46:33.8964407Z * [new tag] trunk/6087ef41e54c2494b117ffd923faf20f515a6806 -> trunk/6087ef41e54c2494b117ffd923faf20f515a6806 2025-09-07T07:46:33.8964811Z * [new tag] trunk/626cb7df8161dd4ecb4fe43b60f37ce9076f56b1 -> trunk/626cb7df8161dd4ecb4fe43b60f37ce9076f56b1 2025-09-07T07:46:33.8965199Z * [new tag] trunk/62c3f9a97fd3dea7132a93066d32d893ffe101e6 -> trunk/62c3f9a97fd3dea7132a93066d32d893ffe101e6 
2025-09-07T07:46:33.8965598Z * [new tag] trunk/63a9c23fe99eacfd09610c36dfe8f01b053c1a35 -> trunk/63a9c23fe99eacfd09610c36dfe8f01b053c1a35 2025-09-07T07:46:33.8965974Z * [new tag] trunk/65985937d97505f648b6ed852c3129f2dd08b251 -> trunk/65985937d97505f648b6ed852c3129f2dd08b251 2025-09-07T07:46:33.8966457Z * [new tag] trunk/66f3b4a682a6153517dd23369fdc3289b6494b07 -> trunk/66f3b4a682a6153517dd23369fdc3289b6494b07 2025-09-07T07:46:33.8966849Z * [new tag] trunk/6737e2c996990024187ba620d2764f3b6f6add2c -> trunk/6737e2c996990024187ba620d2764f3b6f6add2c 2025-09-07T07:46:33.8967228Z * [new tag] trunk/67c31dcd364f10072a55f4a30ffd1151c686283a -> trunk/67c31dcd364f10072a55f4a30ffd1151c686283a 2025-09-07T07:46:33.8967624Z * [new tag] trunk/68738beff73e9c3512e18b4edea811a897ce42db -> trunk/68738beff73e9c3512e18b4edea811a897ce42db 2025-09-07T07:46:33.8968001Z * [new tag] trunk/69a25f68884a168550695fdb1a7c310c54d29536 -> trunk/69a25f68884a168550695fdb1a7c310c54d29536 2025-09-07T07:46:33.8968386Z * [new tag] trunk/6b1900c22f1a07b9519346898d4c71d8a2b0f12f -> trunk/6b1900c22f1a07b9519346898d4c71d8a2b0f12f 2025-09-07T07:46:33.8968773Z * [new tag] trunk/6b8b3ac4403f771bd4a8f9a45d93347304148774 -> trunk/6b8b3ac4403f771bd4a8f9a45d93347304148774 2025-09-07T07:46:33.8969154Z * [new tag] trunk/6f7608d603834d6068b2e7a5d59bec3973b6bb1b -> trunk/6f7608d603834d6068b2e7a5d59bec3973b6bb1b 2025-09-07T07:46:33.8969547Z * [new tag] trunk/70d36e047dfb3488fd6335016711a784d810ebda -> trunk/70d36e047dfb3488fd6335016711a784d810ebda 2025-09-07T07:46:33.8969925Z * [new tag] trunk/71992dd805ff9d6763f77214dfe8b0465e88c87b -> trunk/71992dd805ff9d6763f77214dfe8b0465e88c87b 2025-09-07T07:46:33.8970322Z * [new tag] trunk/734ce8eba9c69381f187359bf0fef1d71d84cd20 -> trunk/734ce8eba9c69381f187359bf0fef1d71d84cd20 2025-09-07T07:46:33.8970704Z * [new tag] trunk/73eb4511fb863a37944342b7e92aae706de603c8 -> trunk/73eb4511fb863a37944342b7e92aae706de603c8 2025-09-07T07:46:33.8971100Z * [new tag] 
trunk/75bc23cfc345bd4c05e7f97c416c4b3d2d1fa64b -> trunk/75bc23cfc345bd4c05e7f97c416c4b3d2d1fa64b 2025-09-07T07:46:33.8971574Z * [new tag] trunk/771f369448321a387f2018535bc8b8b6e5f12fab -> trunk/771f369448321a387f2018535bc8b8b6e5f12fab 2025-09-07T07:46:33.8971962Z * [new tag] trunk/789d4942127143f2adcb53612c058ce4c9a2cf20 -> trunk/789d4942127143f2adcb53612c058ce4c9a2cf20 2025-09-07T07:46:33.8972345Z * [new tag] trunk/791eff96c85678c950888f9da24650083ee673fe -> trunk/791eff96c85678c950888f9da24650083ee673fe 2025-09-07T07:46:33.8972740Z * [new tag] trunk/793fc12aff1f69fbbf9f4278182fb52bbe350fc9 -> trunk/793fc12aff1f69fbbf9f4278182fb52bbe350fc9 2025-09-07T07:46:33.8973137Z * [new tag] trunk/79fcd5247a9a129eee526a14df30bfc6a22b3f01 -> trunk/79fcd5247a9a129eee526a14df30bfc6a22b3f01 2025-09-07T07:46:33.8973519Z * [new tag] trunk/7f4ff79210eb06924f223ae3a1941ee0e2635348 -> trunk/7f4ff79210eb06924f223ae3a1941ee0e2635348 2025-09-07T07:46:33.8973903Z * [new tag] trunk/8076a185c85112be62be292eb47409c88a585b1c -> trunk/8076a185c85112be62be292eb47409c88a585b1c 2025-09-07T07:46:33.8974285Z * [new tag] trunk/80dd397f1979371a5583fa3d5c7352029522a78d -> trunk/80dd397f1979371a5583fa3d5c7352029522a78d 2025-09-07T07:46:33.8974657Z * [new tag] trunk/8171d6052ec12628eb67e0040839314056014429 -> trunk/8171d6052ec12628eb67e0040839314056014429 2025-09-07T07:46:33.8975055Z * [new tag] trunk/81aeefa657b7ccc26b275c50a9f33b2f056e8071 -> trunk/81aeefa657b7ccc26b275c50a9f33b2f056e8071 2025-09-07T07:46:33.8975436Z * [new tag] trunk/81b7b16618bda250ce55982894a83dc0805eb64c -> trunk/81b7b16618bda250ce55982894a83dc0805eb64c 2025-09-07T07:46:33.8975825Z * [new tag] trunk/827f0d405448de31f79d1089f7d7fceab2f87895 -> trunk/827f0d405448de31f79d1089f7d7fceab2f87895 2025-09-07T07:46:33.8976210Z * [new tag] trunk/82f63c8f6de63c30132a8ac299b6e8c2fd0d3fe8 -> trunk/82f63c8f6de63c30132a8ac299b6e8c2fd0d3fe8 2025-09-07T07:46:33.8976599Z * [new tag] trunk/850e1382a9c56bfde18af09d3e72352d775e9435 -> 
trunk/850e1382a9c56bfde18af09d3e72352d775e9435 2025-09-07T07:46:33.8977148Z * [new tag] trunk/8678d831c48e616b717bff50f2d03141d2e9f965 -> trunk/8678d831c48e616b717bff50f2d03141d2e9f965 2025-09-07T07:46:33.8977633Z * [new tag] trunk/869cbcc16e489a4f5a14a93d5779b0ea86061c60 -> trunk/869cbcc16e489a4f5a14a93d5779b0ea86061c60 2025-09-07T07:46:33.8978023Z * [new tag] trunk/8703debf669bc2238211bfd039f4ecdd8228b7f7 -> trunk/8703debf669bc2238211bfd039f4ecdd8228b7f7 2025-09-07T07:46:33.8978421Z * [new tag] trunk/874069fbe46e82da5cfa405e6c0deb12e89ff608 -> trunk/874069fbe46e82da5cfa405e6c0deb12e89ff608 2025-09-07T07:46:33.8978814Z * [new tag] trunk/8875d6e394da2fffd04f31b28bf258c94d4776a3 -> trunk/8875d6e394da2fffd04f31b28bf258c94d4776a3 2025-09-07T07:46:33.8979197Z * [new tag] trunk/88d94d17e8c5155451393afa6eb3bab48ab61c16 -> trunk/88d94d17e8c5155451393afa6eb3bab48ab61c16 2025-09-07T07:46:33.8979595Z * [new tag] trunk/890626632def7e0ef95a2d01e87a0e4627824a9f -> trunk/890626632def7e0ef95a2d01e87a0e4627824a9f 2025-09-07T07:46:33.8979981Z * [new tag] trunk/8975cda2520b7b1b5bc3b4d8213edf261fa82570 -> trunk/8975cda2520b7b1b5bc3b4d8213edf261fa82570 2025-09-07T07:46:33.8980370Z * [new tag] trunk/89d41d3f61d04f14730ec26f008a59bef6624610 -> trunk/89d41d3f61d04f14730ec26f008a59bef6624610 2025-09-07T07:46:33.8980753Z * [new tag] trunk/8bb213b6d599ef1273fe52f9b1f6d476056c3a41 -> trunk/8bb213b6d599ef1273fe52f9b1f6d476056c3a41 2025-09-07T07:46:33.8981147Z * [new tag] trunk/8e23a1227b5fb2e39afaa7d57c075a75b640a5af -> trunk/8e23a1227b5fb2e39afaa7d57c075a75b640a5af 2025-09-07T07:46:33.8981535Z * [new tag] trunk/8ec551bb354ab2b85fbbba9d461740a20366d248 -> trunk/8ec551bb354ab2b85fbbba9d461740a20366d248 2025-09-07T07:46:33.8982039Z * [new tag] trunk/8fd3c9ce919c8d5c645fd348bba517e948cbc29d -> trunk/8fd3c9ce919c8d5c645fd348bba517e948cbc29d 2025-09-07T07:46:33.8982415Z * [new tag] trunk/90f50f7e68e120d9574e6e3189e37b4280010ad9 -> trunk/90f50f7e68e120d9574e6e3189e37b4280010ad9 
2025-09-07T07:46:33.8982802Z * [new tag] trunk/91f0bcf43fc0bc743350d491ac63b77e92054ac9 -> trunk/91f0bcf43fc0bc743350d491ac63b77e92054ac9 2025-09-07T07:46:33.8983189Z * [new tag] trunk/92576a594b8121f6b0b1b5a3ea16d08792fc68ab -> trunk/92576a594b8121f6b0b1b5a3ea16d08792fc68ab 2025-09-07T07:46:33.8983573Z * [new tag] trunk/92a43025e0baa1f2ce345f28d22913b518a1ab9d -> trunk/92a43025e0baa1f2ce345f28d22913b518a1ab9d 2025-09-07T07:46:33.8983960Z * [new tag] trunk/93fb23d6fae7c4e82c4239a1033e522088742634 -> trunk/93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:46:33.8984347Z * [new tag] trunk/9458d1ac3bd70c2af316a8ba95d2c6c9c1199c9c -> trunk/9458d1ac3bd70c2af316a8ba95d2c6c9c1199c9c 2025-09-07T07:46:33.8984745Z * [new tag] trunk/9480cdc0b61488c89a23c2f64f43b2dcedc8728e -> trunk/9480cdc0b61488c89a23c2f64f43b2dcedc8728e 2025-09-07T07:46:33.8985127Z * [new tag] trunk/9491d289b329e4ba4a9f5f5b1be7960671bb7840 -> trunk/9491d289b329e4ba4a9f5f5b1be7960671bb7840 2025-09-07T07:46:33.8985506Z * [new tag] trunk/9499c8761cd2067feb9877414e818f6fd00290f1 -> trunk/9499c8761cd2067feb9877414e818f6fd00290f1 2025-09-07T07:46:33.8985906Z * [new tag] trunk/95ee0bfea99d3d346d6502b91b497d2b35795504 -> trunk/95ee0bfea99d3d346d6502b91b497d2b35795504 2025-09-07T07:46:33.8986289Z * [new tag] trunk/98374612fc2febd686be20761e56bdc2424bc36a -> trunk/98374612fc2febd686be20761e56bdc2424bc36a 2025-09-07T07:46:33.8986680Z * [new tag] trunk/98efc9e93d8fc61eb53cb91378443617cb550500 -> trunk/98efc9e93d8fc61eb53cb91378443617cb550500 2025-09-07T07:46:33.8987077Z * [new tag] trunk/994f2a5dbcbdc915da39bf6f6ce4d1f5e74835c9 -> trunk/994f2a5dbcbdc915da39bf6f6ce4d1f5e74835c9 2025-09-07T07:46:33.8987550Z * [new tag] trunk/99f356fa58c8d726cef022d8710f5491291158f6 -> trunk/99f356fa58c8d726cef022d8710f5491291158f6 2025-09-07T07:46:33.8987938Z * [new tag] trunk/9a1c5c0a078b94d13ac5c1ae0d754d19fb73bf99 -> trunk/9a1c5c0a078b94d13ac5c1ae0d754d19fb73bf99 2025-09-07T07:46:33.8988334Z * [new tag] 
trunk/9a665ca3c472384e9d722bddba79e5a7680f1abd -> trunk/9a665ca3c472384e9d722bddba79e5a7680f1abd 2025-09-07T07:46:33.8988713Z * [new tag] trunk/9aedb3cd87b52160872173c177f61053d97bed57 -> trunk/9aedb3cd87b52160872173c177f61053d97bed57 2025-09-07T07:46:33.8989088Z * [new tag] trunk/9b81fe281da41f2421506339d26b027a468902f4 -> trunk/9b81fe281da41f2421506339d26b027a468902f4 2025-09-07T07:46:33.8989499Z * [new tag] trunk/9bdcee01f86e2969cff1140cdecfca13cb51816e -> trunk/9bdcee01f86e2969cff1140cdecfca13cb51816e 2025-09-07T07:46:33.8989889Z * [new tag] trunk/9c03d6be87eedc06e524e202e07a7e776551a839 -> trunk/9c03d6be87eedc06e524e202e07a7e776551a839 2025-09-07T07:46:33.8990286Z * [new tag] trunk/9c957723a0fedd9c637e63e023a613019e2cab60 -> trunk/9c957723a0fedd9c637e63e023a613019e2cab60 2025-09-07T07:46:33.8990664Z * [new tag] trunk/9e5247f51d81735e5f1e65e80588985fa93bccc5 -> trunk/9e5247f51d81735e5f1e65e80588985fa93bccc5 2025-09-07T07:46:33.8991064Z * [new tag] trunk/9eadb37cdd699f7e8e8177a5227bfeb16184ef26 -> trunk/9eadb37cdd699f7e8e8177a5227bfeb16184ef26 2025-09-07T07:46:33.8991454Z * [new tag] trunk/a00cdc1e4159db73c9ffb3f25e93e55877709a29 -> trunk/a00cdc1e4159db73c9ffb3f25e93e55877709a29 2025-09-07T07:46:33.8991838Z * [new tag] trunk/a02ee4a816d11380c6f564c1aba64d56af5ba705 -> trunk/a02ee4a816d11380c6f564c1aba64d56af5ba705 2025-09-07T07:46:33.8992227Z * [new tag] trunk/a3c7f77e50f900721817934120d60c2361b3c40d -> trunk/a3c7f77e50f900721817934120d60c2361b3c40d 2025-09-07T07:46:33.8992719Z * [new tag] trunk/a3d72b09ae12126a2b7d4a63a45ac100a882a802 -> trunk/a3d72b09ae12126a2b7d4a63a45ac100a882a802 2025-09-07T07:46:33.8993109Z * [new tag] trunk/a3e5466002791da609fcb069155d8ee347baee92 -> trunk/a3e5466002791da609fcb069155d8ee347baee92 2025-09-07T07:46:33.8993492Z * [new tag] trunk/a714437093ed196eee28f7de454cf4c41badc098 -> trunk/a714437093ed196eee28f7de454cf4c41badc098 2025-09-07T07:46:33.8993885Z * [new tag] trunk/a75e8cd27098f290de0b7439685d05ce02e91356 -> 
trunk/a75e8cd27098f290de0b7439685d05ce02e91356 2025-09-07T07:46:33.8994269Z * [new tag] trunk/a8d6943d36c1c2a5f90d3573460695bad4b623ae -> trunk/a8d6943d36c1c2a5f90d3573460695bad4b623ae 2025-09-07T07:46:33.8994675Z * [new tag] trunk/a918bbad6ab20649ff82eefb48417ecbe96bcb34 -> trunk/a918bbad6ab20649ff82eefb48417ecbe96bcb34 2025-09-07T07:46:33.8995066Z * [new tag] trunk/a99d8d39bc842d6ebc3e368b178e4884d24b056e -> trunk/a99d8d39bc842d6ebc3e368b178e4884d24b056e 2025-09-07T07:46:33.8995454Z * [new tag] trunk/aac1a50a191b4102d566c9c1ea22f06d6c2e3f02 -> trunk/aac1a50a191b4102d566c9c1ea22f06d6c2e3f02 2025-09-07T07:46:33.8995857Z * [new tag] trunk/aad96a202244c7d0d120c04ba8db593edd8c0f92 -> trunk/aad96a202244c7d0d120c04ba8db593edd8c0f92 2025-09-07T07:46:33.8996247Z * [new tag] trunk/ab643e4dbbaf7b663d4237514cbf01af9b11565c -> trunk/ab643e4dbbaf7b663d4237514cbf01af9b11565c 2025-09-07T07:46:33.8996647Z * [new tag] trunk/abc447174cd2cf8591edbc70a9f836f9a5779f47 -> trunk/abc447174cd2cf8591edbc70a9f836f9a5779f47 2025-09-07T07:46:33.8997049Z * [new tag] trunk/acece97c3a9dceb63194e314da93fdf37cf15a0d -> trunk/acece97c3a9dceb63194e314da93fdf37cf15a0d 2025-09-07T07:46:33.8997436Z * [new tag] trunk/ada43ed39c80b746b4822c92640a1882619e2795 -> trunk/ada43ed39c80b746b4822c92640a1882619e2795 2025-09-07T07:46:33.8997834Z * [new tag] trunk/adae7f66aacf3f248c3101b858cf98d5809119fa -> trunk/adae7f66aacf3f248c3101b858cf98d5809119fa 2025-09-07T07:46:33.8998319Z * [new tag] trunk/ae0edc133e61e3b16caf0b2ee0ff3f33ab72af4c -> trunk/ae0edc133e61e3b16caf0b2ee0ff3f33ab72af4c 2025-09-07T07:46:33.8998710Z * [new tag] trunk/aed33a8fcbd60b052d4559d261390c5797129c6d -> trunk/aed33a8fcbd60b052d4559d261390c5797129c6d 2025-09-07T07:46:33.8999085Z * [new tag] trunk/b04e922712080a3652e438d05e8bb74e0cd2d238 -> trunk/b04e922712080a3652e438d05e8bb74e0cd2d238 2025-09-07T07:46:33.8999486Z * [new tag] trunk/b0a3e58dd71c1a039ac0ef51e5bd8f704f632f6f -> trunk/b0a3e58dd71c1a039ac0ef51e5bd8f704f632f6f 
2025-09-07T07:46:33.8999879Z * [new tag] trunk/b16d3f4c8c01d461c2f01064e9ca5fa2b33f5cf1 -> trunk/b16d3f4c8c01d461c2f01064e9ca5fa2b33f5cf1 2025-09-07T07:46:33.9000269Z * [new tag] trunk/b18bb6796f210a183e687d9d64984a5a9d13cf09 -> trunk/b18bb6796f210a183e687d9d64984a5a9d13cf09 2025-09-07T07:46:33.9000673Z * [new tag] trunk/b1bb98ddebdd3e41bf7987372409bdce96ae55de -> trunk/b1bb98ddebdd3e41bf7987372409bdce96ae55de 2025-09-07T07:46:33.9001058Z * [new tag] trunk/b2b4add0e754411372060e1d7b4057a66439172b -> trunk/b2b4add0e754411372060e1d7b4057a66439172b 2025-09-07T07:46:33.9001455Z * [new tag] trunk/b2c7b9ad2dc5a7c0b61febd307761bd5bc2f0f05 -> trunk/b2c7b9ad2dc5a7c0b61febd307761bd5bc2f0f05 2025-09-07T07:46:33.9001843Z * [new tag] trunk/b40d9432be44a6b5974ee62e7d19c3c61c5ece37 -> trunk/b40d9432be44a6b5974ee62e7d19c3c61c5ece37 2025-09-07T07:46:33.9002230Z * [new tag] trunk/b4ad38279b178b7bd14355123c1101e2e853e77b -> trunk/b4ad38279b178b7bd14355123c1101e2e853e77b 2025-09-07T07:46:33.9002617Z * [new tag] trunk/b67c41039835bd9b20b83cd6233e86baaa5f5dde -> trunk/b67c41039835bd9b20b83cd6233e86baaa5f5dde 2025-09-07T07:46:33.9003266Z * [new tag] trunk/b6d0a9ea9056ede4f7024dbf3bd6c43be3aff49c -> trunk/b6d0a9ea9056ede4f7024dbf3bd6c43be3aff49c 2025-09-07T07:46:33.9003664Z * [new tag] trunk/b7dad7dd49448c88d0751fa2e29c70afe985f734 -> trunk/b7dad7dd49448c88d0751fa2e29c70afe985f734 2025-09-07T07:46:33.9004064Z * [new tag] trunk/b7e207ca9f046ddd716076965a0cce403ba99052 -> trunk/b7e207ca9f046ddd716076965a0cce403ba99052 2025-09-07T07:46:33.9004451Z * [new tag] trunk/b919560c4a7010e2d89facee25586269a994746e -> trunk/b919560c4a7010e2d89facee25586269a994746e 2025-09-07T07:46:33.9004850Z * [new tag] trunk/b9ba612f7a968f7b27e121ca8f4d0a4d954f5354 -> trunk/b9ba612f7a968f7b27e121ca8f4d0a4d954f5354 2025-09-07T07:46:33.9005245Z * [new tag] trunk/ba7f546ccccb5e0b36d9070dc25f26a9647f89f8 -> trunk/ba7f546ccccb5e0b36d9070dc25f26a9647f89f8 2025-09-07T07:46:33.9005625Z * [new tag] 
trunk/bb950284c7e72905994bc25dd436c10e48088d85 -> trunk/bb950284c7e72905994bc25dd436c10e48088d85 2025-09-07T07:46:33.9006039Z * [new tag] trunk/bbedc71fd3267c639c38b4ec25eaa22f973d9c4d -> trunk/bbedc71fd3267c639c38b4ec25eaa22f973d9c4d 2025-09-07T07:46:33.9006443Z * [new tag] trunk/bc4db2c27fce6ff1648bdc5af31ec225d2a31f37 -> trunk/bc4db2c27fce6ff1648bdc5af31ec225d2a31f37 2025-09-07T07:46:33.9006833Z * [new tag] trunk/bc505977fb66677a09c31155c987330fbb18a865 -> trunk/bc505977fb66677a09c31155c987330fbb18a865 2025-09-07T07:46:33.9007229Z * [new tag] trunk/bd39e47feea7326afb5bbb67fcb1e69279239527 -> trunk/bd39e47feea7326afb5bbb67fcb1e69279239527 2025-09-07T07:46:33.9007634Z * [new tag] trunk/be5b03dde96638f25ffd732a4fed7e41b4cf40e1 -> trunk/be5b03dde96638f25ffd732a4fed7e41b4cf40e1 2025-09-07T07:46:33.9008027Z * [new tag] trunk/bffc7dd1f374d8408911cd22c6b3d6df39ded9b3 -> trunk/bffc7dd1f374d8408911cd22c6b3d6df39ded9b3 2025-09-07T07:46:33.9008430Z * [new tag] trunk/c024b1f5a18d5c5aee5cc2acdd4c52b24b93ffcf -> trunk/c024b1f5a18d5c5aee5cc2acdd4c52b24b93ffcf 2025-09-07T07:46:33.9008923Z * [new tag] trunk/c0983e6cc0acf71689e1851d12609e00b3f59371 -> trunk/c0983e6cc0acf71689e1851d12609e00b3f59371 2025-09-07T07:46:33.9009317Z * [new tag] trunk/c10195e723eeeedd099ed8b73eda7184ca618fad -> trunk/c10195e723eeeedd099ed8b73eda7184ca618fad 2025-09-07T07:46:33.9009714Z * [new tag] trunk/c157cf6488ade6a7ee2ce2d25b059e1335630a99 -> trunk/c157cf6488ade6a7ee2ce2d25b059e1335630a99 2025-09-07T07:46:33.9010094Z * [new tag] trunk/c2a30246172fd71d56529907ffd3c27b76b1f3a7 -> trunk/c2a30246172fd71d56529907ffd3c27b76b1f3a7 2025-09-07T07:46:33.9010476Z * [new tag] trunk/c32111149921b48bfef909293f1049e21619ed76 -> trunk/c32111149921b48bfef909293f1049e21619ed76 2025-09-07T07:46:33.9010853Z * [new tag] trunk/c37103234afc832dcad307e9016230810957c9d5 -> trunk/c37103234afc832dcad307e9016230810957c9d5 2025-09-07T07:46:33.9011255Z * [new tag] trunk/c3ceca2995cd35e1376c4b0704669bff1a81e836 -> 
trunk/c3ceca2995cd35e1376c4b0704669bff1a81e836 2025-09-07T07:46:33.9011656Z * [new tag] trunk/c3d54dea9febb1236d48d19e5d4876a63f2e20fd -> trunk/c3d54dea9febb1236d48d19e5d4876a63f2e20fd 2025-09-07T07:46:33.9012036Z * [new tag] trunk/c465b3d52c5687fe910d35a5c75341b77f821741 -> trunk/c465b3d52c5687fe910d35a5c75341b77f821741 2025-09-07T07:46:33.9012433Z * [new tag] trunk/c5b8a10be5e89396da916d1069ffcb7135f0372b -> trunk/c5b8a10be5e89396da916d1069ffcb7135f0372b 2025-09-07T07:46:33.9012812Z * [new tag] trunk/c7e41071a08f4045bc11ab60ec366d7357d56e30 -> trunk/c7e41071a08f4045bc11ab60ec366d7357d56e30 2025-09-07T07:46:33.9013225Z * [new tag] trunk/c98ddaca6d2e19ca37aff00c4ff0cda1e9a6ff65 -> trunk/c98ddaca6d2e19ca37aff00c4ff0cda1e9a6ff65 2025-09-07T07:46:33.9013611Z * [new tag] trunk/cb1e31362c7b53acf4ac95b9f8878064c184f03b -> trunk/cb1e31362c7b53acf4ac95b9f8878064c184f03b 2025-09-07T07:46:33.9014109Z * [new tag] trunk/cbfb005f7cce79974795b148e265f594f59477c8 -> trunk/cbfb005f7cce79974795b148e265f594f59477c8 2025-09-07T07:46:33.9014500Z * [new tag] trunk/cc5bdd12401bda835291d2f3cb297132ebdbf358 -> trunk/cc5bdd12401bda835291d2f3cb297132ebdbf358 2025-09-07T07:46:33.9014899Z * [new tag] trunk/cd529b686d54bbaa443f5b310140de48422d96c7 -> trunk/cd529b686d54bbaa443f5b310140de48422d96c7 2025-09-07T07:46:33.9015274Z * [new tag] trunk/cec0ff122815582af5302360aff03676558c5c87 -> trunk/cec0ff122815582af5302360aff03676558c5c87 2025-09-07T07:46:33.9015663Z * [new tag] trunk/d11720efdb563d02cf4f7d324311fb15a755268e -> trunk/d11720efdb563d02cf4f7d324311fb15a755268e 2025-09-07T07:46:33.9016052Z * [new tag] trunk/d1706d9128ae24d9048167e80d3fe5196d19035e -> trunk/d1706d9128ae24d9048167e80d3fe5196d19035e 2025-09-07T07:46:33.9016454Z * [new tag] trunk/d1a15abfdcaef138f2d9e93a9f46be44f30b766d -> trunk/d1a15abfdcaef138f2d9e93a9f46be44f30b766d 2025-09-07T07:46:33.9016851Z * [new tag] trunk/d232a95d4a79404ca05c1f52d37fde7339dcdf49 -> trunk/d232a95d4a79404ca05c1f52d37fde7339dcdf49 
2025-09-07T07:46:33.9017239Z * [new tag] trunk/d2d4c8e9b2371c9aacfb771d9402ac7427b9778e -> trunk/d2d4c8e9b2371c9aacfb771d9402ac7427b9778e 2025-09-07T07:46:33.9017699Z * [new tag] trunk/d33840c542b387ab08ba49aa6c45aa9567fd9be7 -> trunk/d33840c542b387ab08ba49aa6c45aa9567fd9be7 2025-09-07T07:46:33.9018082Z * [new tag] trunk/d5643e8f3a648a99636bfa1f2a41d54bd3c0d0f1 -> trunk/d5643e8f3a648a99636bfa1f2a41d54bd3c0d0f1 2025-09-07T07:46:33.9018461Z * [new tag] trunk/d5b38410b5b6cf75c7a7389972777a6497926ee7 -> trunk/d5b38410b5b6cf75c7a7389972777a6497926ee7 2025-09-07T07:46:33.9018861Z * [new tag] trunk/d5e0f4202ba14632e4d14862ace096609e763462 -> trunk/d5e0f4202ba14632e4d14862ace096609e763462 2025-09-07T07:46:33.9019251Z * [new tag] trunk/d636c181f9140a7b59be10b36eae23039fc2bb72 -> trunk/d636c181f9140a7b59be10b36eae23039fc2bb72 2025-09-07T07:46:33.9019722Z * [new tag] trunk/d64718503728001a1e78168fd7f2d4ff23e57285 -> trunk/d64718503728001a1e78168fd7f2d4ff23e57285 2025-09-07T07:46:33.9020098Z * [new tag] trunk/d67c29ad22670320d676b02e394274af34e8e643 -> trunk/d67c29ad22670320d676b02e394274af34e8e643 2025-09-07T07:46:33.9020494Z * [new tag] trunk/d6b74568e2c98ce58ecc145b72ac66d4caf7ce95 -> trunk/d6b74568e2c98ce58ecc145b72ac66d4caf7ce95 2025-09-07T07:46:33.9020874Z * [new tag] trunk/d711f27845abd45007ccab6076649ebd896c2661 -> trunk/d711f27845abd45007ccab6076649ebd896c2661 2025-09-07T07:46:33.9021265Z * [new tag] trunk/d9d6dde0f42d4bcc8c97671ac50d5096c7e500ab -> trunk/d9d6dde0f42d4bcc8c97671ac50d5096c7e500ab 2025-09-07T07:46:33.9021675Z * [new tag] trunk/da4db4b33d1fdd046650cf19fdbac581a19bf2f9 -> trunk/da4db4b33d1fdd046650cf19fdbac581a19bf2f9 2025-09-07T07:46:33.9022078Z * [new tag] trunk/dac8a4b91c01c3bbc96f54e621b1ea4ffdbd29d1 -> trunk/dac8a4b91c01c3bbc96f54e621b1ea4ffdbd29d1 2025-09-07T07:46:33.9022469Z * [new tag] trunk/dbec08729fb9848bebed6048c63831b87170d061 -> trunk/dbec08729fb9848bebed6048c63831b87170d061 2025-09-07T07:46:33.9022842Z * [new tag] 
trunk/dcf385395d838f38c8dca25913578230dd43099a -> trunk/dcf385395d838f38c8dca25913578230dd43099a 2025-09-07T07:46:33.9023238Z * [new tag] trunk/dd2519abe83ec3c40d4797492434e41fe3b47e17 -> trunk/dd2519abe83ec3c40d4797492434e41fe3b47e17 2025-09-07T07:46:33.9023636Z * [new tag] trunk/dec72ea4b006dd0fbcaaaa106ad273d73807ab9d -> trunk/dec72ea4b006dd0fbcaaaa106ad273d73807ab9d 2025-09-07T07:46:33.9024022Z * [new tag] trunk/e0a62b266c021b910ce6dc02a6c9429210487717 -> trunk/e0a62b266c021b910ce6dc02a6c9429210487717 2025-09-07T07:46:33.9024492Z * [new tag] trunk/e19e02c84c9dcc408375e5cae3b0709c18b99228 -> trunk/e19e02c84c9dcc408375e5cae3b0709c18b99228 2025-09-07T07:46:33.9024885Z * [new tag] trunk/e304ea4e69d3a7deeb7e48c7450c214a4c953937 -> trunk/e304ea4e69d3a7deeb7e48c7450c214a4c953937 2025-09-07T07:46:33.9025276Z * [new tag] trunk/e3068cdb446adefb5a875616ba37a60235391439 -> trunk/e3068cdb446adefb5a875616ba37a60235391439 2025-09-07T07:46:33.9025654Z * [new tag] trunk/e381d4b0205d5f126c1de534f867ba776f7c3ee6 -> trunk/e381d4b0205d5f126c1de534f867ba776f7c3ee6 2025-09-07T07:46:33.9026045Z * [new tag] trunk/e4bd0ff4f8981b805df32ea5b3550621965ea4f2 -> trunk/e4bd0ff4f8981b805df32ea5b3550621965ea4f2 2025-09-07T07:46:33.9026433Z * [new tag] trunk/e532c9d4f1cdcbc1ea9628f55b9813e77847bdc7 -> trunk/e532c9d4f1cdcbc1ea9628f55b9813e77847bdc7 2025-09-07T07:46:33.9026817Z * [new tag] trunk/e92cd9415377403b6e90585e764639e2e0b5973b -> trunk/e92cd9415377403b6e90585e764639e2e0b5973b 2025-09-07T07:46:33.9027195Z * [new tag] trunk/e9481b6617b5576b099d8ca5798111592e9ad090 -> trunk/e9481b6617b5576b099d8ca5798111592e9ad090 2025-09-07T07:46:33.9027585Z * [new tag] trunk/ea1883dfd3e42defe37b11202b878bb76defa087 -> trunk/ea1883dfd3e42defe37b11202b878bb76defa087 2025-09-07T07:46:33.9027997Z * [new tag] trunk/eac3d6f04cfbbebe3d470dacd216da7d4b1f95a8 -> trunk/eac3d6f04cfbbebe3d470dacd216da7d4b1f95a8 2025-09-07T07:46:33.9028379Z * [new tag] trunk/eb18d32bda75189494d955aa001ade15f10333de -> 
trunk/eb18d32bda75189494d955aa001ade15f10333de 2025-09-07T07:46:33.9028783Z * [new tag] trunk/ef3be6726f7ff4b77c22db10cec5b686f9107ea9 -> trunk/ef3be6726f7ff4b77c22db10cec5b686f9107ea9 2025-09-07T07:46:33.9029177Z * [new tag] trunk/ef8aabd42422725026cb4dbf48aafa9efa226a04 -> trunk/ef8aabd42422725026cb4dbf48aafa9efa226a04 2025-09-07T07:46:33.9029570Z * [new tag] trunk/f00445b43eee57e20bb9316fa796ca23bf73373b -> trunk/f00445b43eee57e20bb9316fa796ca23bf73373b 2025-09-07T07:46:33.9030026Z * [new tag] trunk/f0c391102b754e3b145e8c59231d2df563487e37 -> trunk/f0c391102b754e3b145e8c59231d2df563487e37 2025-09-07T07:46:33.9030412Z * [new tag] trunk/f27985b7e796fb66a1b476284ba42d8cb360a751 -> trunk/f27985b7e796fb66a1b476284ba42d8cb360a751 2025-09-07T07:46:33.9030791Z * [new tag] trunk/f36f285953700f971552083a5da9d0ceacb63bbd -> trunk/f36f285953700f971552083a5da9d0ceacb63bbd 2025-09-07T07:46:33.9031180Z * [new tag] trunk/f3cebec39ebc110e1c8b06e741896585f7892dbb -> trunk/f3cebec39ebc110e1c8b06e741896585f7892dbb 2025-09-07T07:46:33.9031585Z * [new tag] trunk/f4c33cd44acac92c0b451a04da20ebe9370e5b0c -> trunk/f4c33cd44acac92c0b451a04da20ebe9370e5b0c 2025-09-07T07:46:33.9031963Z * [new tag] trunk/f612045ce105f008b2b675e2fc870163babeb2e8 -> trunk/f612045ce105f008b2b675e2fc870163babeb2e8 2025-09-07T07:46:33.9032365Z * [new tag] trunk/f8746b878dfc1e9639d42cbde832e9b9e792c86c -> trunk/f8746b878dfc1e9639d42cbde832e9b9e792c86c 2025-09-07T07:46:33.9032753Z * [new tag] trunk/f8ffa9194e26523e5f976d4a824d5cc58922727c -> trunk/f8ffa9194e26523e5f976d4a824d5cc58922727c 2025-09-07T07:46:33.9033141Z * [new tag] trunk/f981a7fa5230b98974291fdde32fe8488bc5d469 -> trunk/f981a7fa5230b98974291fdde32fe8488bc5d469 2025-09-07T07:46:33.9033541Z * [new tag] trunk/fbf3d2027daabbcb44d0af274b139be2a248a4f7 -> trunk/fbf3d2027daabbcb44d0af274b139be2a248a4f7 2025-09-07T07:46:33.9033925Z * [new tag] trunk/fca2601c9d628e1bd2d75c7318cd22c4e8c832aa -> trunk/fca2601c9d628e1bd2d75c7318cd22c4e8c832aa 
2025-09-07T07:46:33.9034318Z * [new tag] trunk/fea20775ad96bdca972a1811d7d3372f368614ab -> trunk/fea20775ad96bdca972a1811d7d3372f368614ab 2025-09-07T07:46:33.9034793Z * [new tag] trunk/fefee081642f87419a21dc852f7167d4640443cd -> trunk/fefee081642f87419a21dc852f7167d4640443cd 2025-09-07T07:46:33.9034938Z * [new tag] v0.1.1 -> v0.1.1 2025-09-07T07:46:33.9035078Z * [new tag] v0.1.10 -> v0.1.10 2025-09-07T07:46:33.9035215Z * [new tag] v0.1.11 -> v0.1.11 2025-09-07T07:46:33.9035348Z * [new tag] v0.1.12 -> v0.1.12 2025-09-07T07:46:33.9035480Z * [new tag] v0.1.2 -> v0.1.2 2025-09-07T07:46:33.9035620Z * [new tag] v0.1.3 -> v0.1.3 2025-09-07T07:46:33.9035746Z * [new tag] v0.1.4 -> v0.1.4 2025-09-07T07:46:33.9035884Z * [new tag] v0.1.5 -> v0.1.5 2025-09-07T07:46:33.9036010Z * [new tag] v0.1.6 -> v0.1.6 2025-09-07T07:46:33.9036141Z * [new tag] v0.1.7 -> v0.1.7 2025-09-07T07:46:33.9036277Z * [new tag] v0.1.8 -> v0.1.8 2025-09-07T07:46:33.9036406Z * [new tag] v0.1.9 -> v0.1.9 2025-09-07T07:46:33.9036546Z * [new tag] v0.2.0 -> v0.2.0 2025-09-07T07:46:33.9036673Z * [new tag] v0.3.0 -> v0.3.0 2025-09-07T07:46:33.9036809Z * [new tag] v0.3.1 -> v0.3.1 2025-09-07T07:46:33.9036935Z * [new tag] v0.4.0 -> v0.4.0 2025-09-07T07:46:33.9037060Z * [new tag] v0.4.1 -> v0.4.1 2025-09-07T07:46:33.9037198Z * [new tag] v1.0.0 -> v1.0.0 2025-09-07T07:46:33.9037335Z * [new tag] v1.0.0a0 -> v1.0.0a0 2025-09-07T07:46:33.9037473Z * [new tag] v1.0.1 -> v1.0.1 2025-09-07T07:46:33.9037609Z * [new tag] v1.0rc0 -> v1.0rc0 2025-09-07T07:46:33.9037741Z * [new tag] v1.0rc1 -> v1.0rc1 2025-09-07T07:46:33.9037956Z * [new tag] v1.1.0 -> v1.1.0 2025-09-07T07:46:33.9038096Z * [new tag] v1.1.0a0 -> v1.1.0a0 2025-09-07T07:46:33.9038231Z * [new tag] v1.10.0 -> v1.10.0 2025-09-07T07:46:33.9038377Z * [new tag] v1.10.0-rc1 -> v1.10.0-rc1 2025-09-07T07:46:33.9038520Z * [new tag] v1.10.0-rc2 -> v1.10.0-rc2 2025-09-07T07:46:33.9038663Z * [new tag] v1.10.0-rc3 -> v1.10.0-rc3 2025-09-07T07:46:33.9038796Z * [new tag] v1.10.1 
-> v1.10.1 2025-09-07T07:46:33.9038939Z * [new tag] v1.10.1-rc1 -> v1.10.1-rc1 2025-09-07T07:46:33.9039075Z * [new tag] v1.10.2 -> v1.10.2 2025-09-07T07:46:33.9039217Z * [new tag] v1.10.2-rc1 -> v1.10.2-rc1 2025-09-07T07:46:33.9039351Z * [new tag] v1.11.0 -> v1.11.0 2025-09-07T07:46:33.9039490Z * [new tag] v1.11.0-rc1 -> v1.11.0-rc1 2025-09-07T07:46:33.9039634Z * [new tag] v1.11.0-rc2 -> v1.11.0-rc2 2025-09-07T07:46:33.9039773Z * [new tag] v1.11.0-rc3 -> v1.11.0-rc3 2025-09-07T07:46:33.9039918Z * [new tag] v1.11.0-rc4 -> v1.11.0-rc4 2025-09-07T07:46:33.9040057Z * [new tag] v1.11.0-rc5 -> v1.11.0-rc5 2025-09-07T07:46:33.9040198Z * [new tag] v1.11.0-rc6 -> v1.11.0-rc6 2025-09-07T07:46:33.9040343Z * [new tag] v1.11.0-rc7 -> v1.11.0-rc7 2025-09-07T07:46:33.9040475Z * [new tag] v1.12.0 -> v1.12.0 2025-09-07T07:46:33.9040702Z * [new tag] v1.12.0-rc1 -> v1.12.0-rc1 2025-09-07T07:46:33.9040844Z * [new tag] v1.12.0-rc2 -> v1.12.0-rc2 2025-09-07T07:46:33.9040983Z * [new tag] v1.12.0-rc3 -> v1.12.0-rc3 2025-09-07T07:46:33.9041126Z * [new tag] v1.12.0-rc4 -> v1.12.0-rc4 2025-09-07T07:46:33.9041263Z * [new tag] v1.12.0-rc5 -> v1.12.0-rc5 2025-09-07T07:46:33.9041406Z * [new tag] v1.12.0-rc6 -> v1.12.0-rc6 2025-09-07T07:46:33.9041544Z * [new tag] v1.12.0-rc7 -> v1.12.0-rc7 2025-09-07T07:46:33.9041683Z * [new tag] v1.12.0-rc8 -> v1.12.0-rc8 2025-09-07T07:46:33.9041822Z * [new tag] v1.12.1 -> v1.12.1 2025-09-07T07:46:33.9041963Z * [new tag] v1.12.1-rc1 -> v1.12.1-rc1 2025-09-07T07:46:33.9042116Z * [new tag] v1.12.1-rc2 -> v1.12.1-rc2 2025-09-07T07:46:33.9042253Z * [new tag] v1.12.1-rc3 -> v1.12.1-rc3 2025-09-07T07:46:33.9042400Z * [new tag] v1.12.1-rc4 -> v1.12.1-rc4 2025-09-07T07:46:33.9042538Z * [new tag] v1.12.1-rc5 -> v1.12.1-rc5 2025-09-07T07:46:33.9042670Z * [new tag] v1.13.0 -> v1.13.0 2025-09-07T07:46:33.9042815Z * [new tag] v1.13.0-rc1 -> v1.13.0-rc1 2025-09-07T07:46:33.9043083Z * [new tag] v1.13.0-rc2 -> v1.13.0-rc2 2025-09-07T07:46:33.9043236Z * [new tag] v1.13.0-rc3 -> 
v1.13.0-rc3 2025-09-07T07:46:33.9043377Z * [new tag] v1.13.0-rc4 -> v1.13.0-rc4 2025-09-07T07:46:33.9043518Z * [new tag] v1.13.0-rc5 -> v1.13.0-rc5 2025-09-07T07:46:33.9043672Z * [new tag] v1.13.0-rc6 -> v1.13.0-rc6 2025-09-07T07:46:33.9043805Z * [new tag] v1.13.1 -> v1.13.1 2025-09-07T07:46:33.9044047Z * [new tag] v1.13.1-rc1 -> v1.13.1-rc1 2025-09-07T07:46:33.9044181Z * [new tag] v1.2.0 -> v1.2.0 2025-09-07T07:46:33.9044321Z * [new tag] v1.2.0a0 -> v1.2.0a0 2025-09-07T07:46:33.9044459Z * [new tag] v1.3.0 -> v1.3.0 2025-09-07T07:46:33.9044597Z * [new tag] v1.3.0a0 -> v1.3.0a0 2025-09-07T07:46:33.9044735Z * [new tag] v1.3.1 -> v1.3.1 2025-09-07T07:46:33.9044865Z * [new tag] v1.4.0 -> v1.4.0 2025-09-07T07:46:33.9045007Z * [new tag] v1.4.0a0 -> v1.4.0a0 2025-09-07T07:46:33.9045140Z * [new tag] v1.4.1 -> v1.4.1 2025-09-07T07:46:33.9045273Z * [new tag] v1.5.0 -> v1.5.0 2025-09-07T07:46:33.9045422Z * [new tag] v1.5.0-rc1 -> v1.5.0-rc1 2025-09-07T07:46:33.9045565Z * [new tag] v1.5.0-rc2 -> v1.5.0-rc2 2025-09-07T07:46:33.9045716Z * [new tag] v1.5.0-rc3 -> v1.5.0-rc3 2025-09-07T07:46:33.9045854Z * [new tag] v1.5.0-rc4 -> v1.5.0-rc4 2025-09-07T07:46:33.9045994Z * [new tag] v1.5.0-rc5 -> v1.5.0-rc5 2025-09-07T07:46:33.9046129Z * [new tag] v1.5.1 -> v1.5.1 2025-09-07T07:46:33.9046268Z * [new tag] v1.5.1-rc1 -> v1.5.1-rc1 2025-09-07T07:46:33.9046405Z * [new tag] v1.6.0 -> v1.6.0 2025-09-07T07:46:33.9046544Z * [new tag] v1.6.0-rc1 -> v1.6.0-rc1 2025-09-07T07:46:33.9046777Z * [new tag] v1.6.0-rc2 -> v1.6.0-rc2 2025-09-07T07:46:33.9046924Z * [new tag] v1.6.0-rc3 -> v1.6.0-rc3 2025-09-07T07:46:33.9047069Z * [new tag] v1.6.0-rc4 -> v1.6.0-rc4 2025-09-07T07:46:33.9047210Z * [new tag] v1.6.0-rc5 -> v1.6.0-rc5 2025-09-07T07:46:33.9047348Z * [new tag] v1.6.0-rc6 -> v1.6.0-rc6 2025-09-07T07:46:33.9047491Z * [new tag] v1.6.0-rc7 -> v1.6.0-rc7 2025-09-07T07:46:33.9047623Z * [new tag] v1.7.0 -> v1.7.0 2025-09-07T07:46:33.9047761Z * [new tag] v1.7.0-rc1 -> v1.7.0-rc1 
2025-09-07T07:46:33.9047904Z * [new tag] v1.7.0-rc2 -> v1.7.0-rc2 2025-09-07T07:46:33.9048043Z * [new tag] v1.7.0-rc3 -> v1.7.0-rc3 2025-09-07T07:46:33.9048191Z * [new tag] v1.7.0-rc4 -> v1.7.0-rc4 2025-09-07T07:46:33.9048324Z * [new tag] v1.7.1 -> v1.7.1 2025-09-07T07:46:33.9048465Z * [new tag] v1.7.1-rc1 -> v1.7.1-rc1 2025-09-07T07:46:33.9048614Z * [new tag] v1.7.1-rc2 -> v1.7.1-rc2 2025-09-07T07:46:33.9048750Z * [new tag] v1.7.1-rc3 -> v1.7.1-rc3 2025-09-07T07:46:33.9048886Z * [new tag] v1.8.0 -> v1.8.0 2025-09-07T07:46:33.9049022Z * [new tag] v1.8.0-rc1 -> v1.8.0-rc1 2025-09-07T07:46:33.9049158Z * [new tag] v1.8.0-rc2 -> v1.8.0-rc2 2025-09-07T07:46:33.9049300Z * [new tag] v1.8.0-rc3 -> v1.8.0-rc3 2025-09-07T07:46:33.9049439Z * [new tag] v1.8.0-rc4 -> v1.8.0-rc4 2025-09-07T07:46:33.9049583Z * [new tag] v1.8.0-rc5 -> v1.8.0-rc5 2025-09-07T07:46:33.9049712Z * [new tag] v1.8.1 -> v1.8.1 2025-09-07T07:46:33.9049934Z * [new tag] v1.8.1-rc1 -> v1.8.1-rc1 2025-09-07T07:46:33.9050073Z * [new tag] v1.8.1-rc2 -> v1.8.1-rc2 2025-09-07T07:46:33.9050209Z * [new tag] v1.8.1-rc3 -> v1.8.1-rc3 2025-09-07T07:46:33.9050342Z * [new tag] v1.8.2 -> v1.8.2 2025-09-07T07:46:33.9050479Z * [new tag] v1.8.2-rc1 -> v1.8.2-rc1 2025-09-07T07:46:33.9050613Z * [new tag] v1.9.0 -> v1.9.0 2025-09-07T07:46:33.9050749Z * [new tag] v1.9.0-rc1 -> v1.9.0-rc1 2025-09-07T07:46:33.9050884Z * [new tag] v1.9.0-rc2 -> v1.9.0-rc2 2025-09-07T07:46:33.9051026Z * [new tag] v1.9.0-rc3 -> v1.9.0-rc3 2025-09-07T07:46:33.9051165Z * [new tag] v1.9.0-rc4 -> v1.9.0-rc4 2025-09-07T07:46:33.9051298Z * [new tag] v1.9.1 -> v1.9.1 2025-09-07T07:46:33.9051437Z * [new tag] v1.9.1-rc1 -> v1.9.1-rc1 2025-09-07T07:46:33.9051573Z * [new tag] v1.9.1-rc2 -> v1.9.1-rc2 2025-09-07T07:46:33.9051711Z * [new tag] v2.0.0 -> v2.0.0 2025-09-07T07:46:33.9051847Z * [new tag] v2.0.0-rc1 -> v2.0.0-rc1 2025-09-07T07:46:33.9051993Z * [new tag] v2.0.0-rc2 -> v2.0.0-rc2 2025-09-07T07:46:33.9052131Z * [new tag] v2.0.0-rc3 -> v2.0.0-rc3 
2025-09-07T07:46:33.9052275Z * [new tag] v2.0.0-rc4 -> v2.0.0-rc4 2025-09-07T07:46:33.9052411Z * [new tag] v2.0.0-rc5 -> v2.0.0-rc5 2025-09-07T07:46:33.9052638Z * [new tag] v2.0.0-rc6 -> v2.0.0-rc6 2025-09-07T07:46:33.9052774Z * [new tag] v2.0.1 -> v2.0.1 2025-09-07T07:46:33.9052914Z * [new tag] v2.0.1-rc1 -> v2.0.1-rc1 2025-09-07T07:46:33.9053057Z * [new tag] v2.0.1-rc2 -> v2.0.1-rc2 2025-09-07T07:46:33.9053193Z * [new tag] v2.0.1-rc3 -> v2.0.1-rc3 2025-09-07T07:46:33.9053329Z * [new tag] v2.0.1-rc4 -> v2.0.1-rc4 2025-09-07T07:46:33.9053465Z * [new tag] v2.1.0 -> v2.1.0 2025-09-07T07:46:33.9053602Z * [new tag] v2.1.0-rc1 -> v2.1.0-rc1 2025-09-07T07:46:33.9053747Z * [new tag] v2.1.0-rc2 -> v2.1.0-rc2 2025-09-07T07:46:33.9053883Z * [new tag] v2.1.0-rc3 -> v2.1.0-rc3 2025-09-07T07:46:33.9054022Z * [new tag] v2.1.0-rc4 -> v2.1.0-rc4 2025-09-07T07:46:33.9054165Z * [new tag] v2.1.0-rc5 -> v2.1.0-rc5 2025-09-07T07:46:33.9054303Z * [new tag] v2.1.0-rc6 -> v2.1.0-rc6 2025-09-07T07:46:33.9054443Z * [new tag] v2.1.1 -> v2.1.1 2025-09-07T07:46:33.9054577Z * [new tag] v2.1.1-rc1 -> v2.1.1-rc1 2025-09-07T07:46:33.9054718Z * [new tag] v2.1.1-rc2 -> v2.1.1-rc2 2025-09-07T07:46:33.9054855Z * [new tag] v2.1.1-rc3 -> v2.1.1-rc3 2025-09-07T07:46:33.9054990Z * [new tag] v2.1.1-rc4 -> v2.1.1-rc4 2025-09-07T07:46:33.9055135Z * [new tag] v2.1.1-rc5 -> v2.1.1-rc5 2025-09-07T07:46:33.9055272Z * [new tag] v2.1.1-rc6 -> v2.1.1-rc6 2025-09-07T07:46:33.9055409Z * [new tag] v2.1.2 -> v2.1.2 2025-09-07T07:46:33.9055548Z * [new tag] v2.1.2-rc1 -> v2.1.2-rc1 2025-09-07T07:46:33.9055684Z * [new tag] v2.1.2-rc2 -> v2.1.2-rc2 2025-09-07T07:46:33.9055923Z * [new tag] v2.1.2-rc3 -> v2.1.2-rc3 2025-09-07T07:46:33.9056054Z * [new tag] v2.2.0 -> v2.2.0 2025-09-07T07:46:33.9056202Z * [new tag] v2.2.0-rc1 -> v2.2.0-rc1 2025-09-07T07:46:33.9056336Z * [new tag] v2.2.0-rc2 -> v2.2.0-rc2 2025-09-07T07:46:33.9056475Z * [new tag] v2.2.0-rc3 -> v2.2.0-rc3 2025-09-07T07:46:33.9056622Z * [new tag] v2.2.0-rc4 -> 
v2.2.0-rc4 2025-09-07T07:46:33.9056757Z * [new tag] v2.2.0-rc5 -> v2.2.0-rc5 2025-09-07T07:46:33.9056905Z * [new tag] v2.2.0-rc6 -> v2.2.0-rc6 2025-09-07T07:46:33.9057047Z * [new tag] v2.2.0-rc7 -> v2.2.0-rc7 2025-09-07T07:46:33.9057187Z * [new tag] v2.2.0-rc8 -> v2.2.0-rc8 2025-09-07T07:46:33.9057423Z * [new tag] v2.2.1 -> v2.2.1 2025-09-07T07:46:33.9057566Z * [new tag] v2.2.1-rc1 -> v2.2.1-rc1 2025-09-07T07:46:33.9057712Z * [new tag] v2.2.1-rc2 -> v2.2.1-rc2 2025-09-07T07:46:33.9057848Z * [new tag] v2.2.1-rc3 -> v2.2.1-rc3 2025-09-07T07:46:33.9057991Z * [new tag] v2.2.2 -> v2.2.2 2025-09-07T07:46:33.9058125Z * [new tag] v2.2.2-rc1 -> v2.2.2-rc1 2025-09-07T07:46:33.9058262Z * [new tag] v2.2.2-rc2 -> v2.2.2-rc2 2025-09-07T07:46:33.9058410Z * [new tag] v2.2.2-rc3 -> v2.2.2-rc3 2025-09-07T07:46:33.9058636Z * [new tag] v2.3.0 -> v2.3.0 2025-09-07T07:46:33.9058784Z * [new tag] v2.3.0-rc1 -> v2.3.0-rc1 2025-09-07T07:46:33.9058929Z * [new tag] v2.3.0-rc10 -> v2.3.0-rc10 2025-09-07T07:46:33.9059070Z * [new tag] v2.3.0-rc11 -> v2.3.0-rc11 2025-09-07T07:46:33.9059217Z * [new tag] v2.3.0-rc12 -> v2.3.0-rc12 2025-09-07T07:46:33.9059357Z * [new tag] v2.3.0-rc2 -> v2.3.0-rc2 2025-09-07T07:46:33.9059503Z * [new tag] v2.3.0-rc3 -> v2.3.0-rc3 2025-09-07T07:46:33.9059640Z * [new tag] v2.3.0-rc4 -> v2.3.0-rc4 2025-09-07T07:46:33.9059778Z * [new tag] v2.3.0-rc5 -> v2.3.0-rc5 2025-09-07T07:46:33.9059922Z * [new tag] v2.3.0-rc6 -> v2.3.0-rc6 2025-09-07T07:46:33.9060059Z * [new tag] v2.3.0-rc7 -> v2.3.0-rc7 2025-09-07T07:46:33.9060209Z * [new tag] v2.3.0-rc8 -> v2.3.0-rc8 2025-09-07T07:46:33.9060350Z * [new tag] v2.3.0-rc9 -> v2.3.0-rc9 2025-09-07T07:46:33.9060491Z * [new tag] v2.3.1 -> v2.3.1 2025-09-07T07:46:33.9060627Z * [new tag] v2.3.1-rc1 -> v2.3.1-rc1 2025-09-07T07:46:33.9060763Z * [new tag] v2.3.1-rc2 -> v2.3.1-rc2 2025-09-07T07:46:33.9060905Z * [new tag] v2.3.1-rc3 -> v2.3.1-rc3 2025-09-07T07:46:33.9061035Z * [new tag] v2.4.0 -> v2.4.0 2025-09-07T07:46:33.9061177Z * [new tag] 
v2.4.0-rc1 -> v2.4.0-rc1 2025-09-07T07:46:33.9061322Z * [new tag] v2.4.0-rc2 -> v2.4.0-rc2 2025-09-07T07:46:33.9061462Z * [new tag] v2.4.0-rc3 -> v2.4.0-rc3 2025-09-07T07:46:33.9061619Z * [new tag] v2.4.0-rc4 -> v2.4.0-rc4 2025-09-07T07:46:33.9061759Z * [new tag] v2.4.0-rc5 -> v2.4.0-rc5 2025-09-07T07:46:33.9061994Z * [new tag] v2.4.0-rc6 -> v2.4.0-rc6 2025-09-07T07:46:33.9062139Z * [new tag] v2.4.0-rc7 -> v2.4.0-rc7 2025-09-07T07:46:33.9062282Z * [new tag] v2.4.0-rc8 -> v2.4.0-rc8 2025-09-07T07:46:33.9062435Z * [new tag] v2.4.0-rc9 -> v2.4.0-rc9 2025-09-07T07:46:33.9062570Z * [new tag] v2.4.1 -> v2.4.1 2025-09-07T07:46:33.9062723Z * [new tag] v2.4.1-rc1 -> v2.4.1-rc1 2025-09-07T07:46:33.9062866Z * [new tag] v2.4.1-rc2 -> v2.4.1-rc2 2025-09-07T07:46:33.9063019Z * [new tag] v2.4.1-rc3 -> v2.4.1-rc3 2025-09-07T07:46:33.9063156Z * [new tag] v2.5.0 -> v2.5.0 2025-09-07T07:46:33.9063296Z * [new tag] v2.5.0-rc1 -> v2.5.0-rc1 2025-09-07T07:46:33.9063460Z * [new tag] v2.5.0-rc10 -> v2.5.0-rc10 2025-09-07T07:46:33.9063601Z * [new tag] v2.5.0-rc2 -> v2.5.0-rc2 2025-09-07T07:46:33.9063756Z * [new tag] v2.5.0-rc3 -> v2.5.0-rc3 2025-09-07T07:46:33.9063897Z * [new tag] v2.5.0-rc4 -> v2.5.0-rc4 2025-09-07T07:46:33.9064037Z * [new tag] v2.5.0-rc5 -> v2.5.0-rc5 2025-09-07T07:46:33.9064190Z * [new tag] v2.5.0-rc6 -> v2.5.0-rc6 2025-09-07T07:46:33.9064331Z * [new tag] v2.5.0-rc7 -> v2.5.0-rc7 2025-09-07T07:46:33.9064487Z * [new tag] v2.5.0-rc8 -> v2.5.0-rc8 2025-09-07T07:46:33.9064714Z * [new tag] v2.5.0-rc9 -> v2.5.0-rc9 2025-09-07T07:46:33.9064851Z * [new tag] v2.5.1 -> v2.5.1 2025-09-07T07:46:33.9065007Z * [new tag] v2.5.1-rc1 -> v2.5.1-rc1 2025-09-07T07:46:33.9065143Z * [new tag] v2.6.0 -> v2.6.0 2025-09-07T07:46:33.9065300Z * [new tag] v2.6.0-rc1 -> v2.6.0-rc1 2025-09-07T07:46:33.9065441Z * [new tag] v2.6.0-rc2 -> v2.6.0-rc2 2025-09-07T07:46:33.9065594Z * [new tag] v2.6.0-rc3 -> v2.6.0-rc3 2025-09-07T07:46:33.9065736Z * [new tag] v2.6.0-rc4 -> v2.6.0-rc4 
2025-09-07T07:46:33.9065878Z * [new tag] v2.6.0-rc5 -> v2.6.0-rc5 2025-09-07T07:46:33.9066034Z * [new tag] v2.6.0-rc6 -> v2.6.0-rc6 2025-09-07T07:46:33.9066176Z * [new tag] v2.6.0-rc7 -> v2.6.0-rc7 2025-09-07T07:46:33.9066335Z * [new tag] v2.6.0-rc8 -> v2.6.0-rc8 2025-09-07T07:46:33.9066479Z * [new tag] v2.6.0-rc9 -> v2.6.0-rc9 2025-09-07T07:46:33.9066617Z * [new tag] v2.7.0 -> v2.7.0 2025-09-07T07:46:33.9066771Z * [new tag] v2.7.0-rc1 -> v2.7.0-rc1 2025-09-07T07:46:33.9066917Z * [new tag] v2.7.0-rc10 -> v2.7.0-rc10 2025-09-07T07:46:33.9067074Z * [new tag] v2.7.0-rc2 -> v2.7.0-rc2 2025-09-07T07:46:33.9067215Z * [new tag] v2.7.0-rc3 -> v2.7.0-rc3 2025-09-07T07:46:33.9067359Z * [new tag] v2.7.0-rc4 -> v2.7.0-rc4 2025-09-07T07:46:33.9067515Z * [new tag] v2.7.0-rc5 -> v2.7.0-rc5 2025-09-07T07:46:33.9067657Z * [new tag] v2.7.0-rc6 -> v2.7.0-rc6 2025-09-07T07:46:33.9067813Z * [new tag] v2.7.0-rc7 -> v2.7.0-rc7 2025-09-07T07:46:33.9067954Z * [new tag] v2.7.0-rc8 -> v2.7.0-rc8 2025-09-07T07:46:33.9068197Z * [new tag] v2.7.0-rc9 -> v2.7.0-rc9 2025-09-07T07:46:33.9068335Z * [new tag] v2.7.1 -> v2.7.1 2025-09-07T07:46:33.9068479Z * [new tag] v2.7.1-rc1 -> v2.7.1-rc1 2025-09-07T07:46:33.9068633Z * [new tag] v2.7.1-rc2 -> v2.7.1-rc2 2025-09-07T07:46:33.9068775Z * [new tag] v2.7.1-rc3 -> v2.7.1-rc3 2025-09-07T07:46:33.9068930Z * [new tag] v2.7.1-rc4 -> v2.7.1-rc4 2025-09-07T07:46:33.9069071Z * [new tag] v2.7.1-rc5 -> v2.7.1-rc5 2025-09-07T07:46:33.9069207Z * [new tag] v2.8.0 -> v2.8.0 2025-09-07T07:46:33.9069369Z * [new tag] v2.8.0-rc1 -> v2.8.0-rc1 2025-09-07T07:46:33.9069511Z * [new tag] v2.8.0-rc2 -> v2.8.0-rc2 2025-09-07T07:46:33.9069672Z * [new tag] v2.8.0-rc3 -> v2.8.0-rc3 2025-09-07T07:46:33.9069816Z * [new tag] v2.8.0-rc4 -> v2.8.0-rc4 2025-09-07T07:46:33.9069959Z * [new tag] v2.8.0-rc5 -> v2.8.0-rc5 2025-09-07T07:46:33.9070112Z * [new tag] v2.8.0-rc6 -> v2.8.0-rc6 2025-09-07T07:46:33.9070254Z * [new tag] v2.8.0-rc7 -> v2.8.0-rc7 2025-09-07T07:46:33.9070409Z * [new tag] 
v2.8.0-rc8 -> v2.8.0-rc8 2025-09-07T07:46:33.9070563Z * [new tag] whc_flight_1 -> whc_flight_1 2025-09-07T07:46:33.9070729Z * [new tag] whc_flight_2 -> whc_flight_2 2025-09-07T07:46:33.9070876Z * [new tag] whc_flight_4 -> whc_flight_4 2025-09-07T07:46:35.4866443Z [command]/usr/bin/git rev-parse --verify --quiet 93fb23d6fae7c4e82c4239a1033e522088742634^{object} 2025-09-07T07:46:35.4901119Z 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:46:35.4905534Z ##[endgroup] 2025-09-07T07:46:35.4906010Z ##[group]Determining the checkout info 2025-09-07T07:46:35.4906584Z ##[endgroup] 2025-09-07T07:46:35.4910026Z [command]/usr/bin/git sparse-checkout disable 2025-09-07T07:46:35.4957141Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-09-07T07:46:35.4984136Z ##[group]Checking out the ref 2025-09-07T07:46:35.4987812Z [command]/usr/bin/git checkout --progress --force 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:46:36.5573211Z Updating files: 23% (4465/19405) 2025-09-07T07:46:36.5710285Z Updating files: 24% (4658/19405) 2025-09-07T07:46:36.5768389Z Updating files: 25% (4852/19405) 2025-09-07T07:46:36.5798648Z Updating files: 26% (5046/19405) 2025-09-07T07:46:36.5830193Z Updating files: 27% (5240/19405) 2025-09-07T07:46:36.5870355Z Updating files: 28% (5434/19405) 2025-09-07T07:46:36.5904823Z Updating files: 29% (5628/19405) 2025-09-07T07:46:36.5936639Z Updating files: 30% (5822/19405) 2025-09-07T07:46:36.6034111Z Updating files: 31% (6016/19405) 2025-09-07T07:46:36.6146574Z Updating files: 32% (6210/19405) 2025-09-07T07:46:36.6288668Z Updating files: 33% (6404/19405) 2025-09-07T07:46:36.6529089Z Updating files: 34% (6598/19405) 2025-09-07T07:46:36.6749037Z Updating files: 35% (6792/19405) 2025-09-07T07:46:36.6780499Z Updating files: 36% (6986/19405) 2025-09-07T07:46:36.6811455Z Updating files: 37% (7180/19405) 2025-09-07T07:46:36.6842024Z Updating files: 38% (7374/19405) 2025-09-07T07:46:36.6873612Z Updating files: 39% (7568/19405) 
2025-09-07T07:46:36.6905845Z Updating files: 40% (7762/19405) 2025-09-07T07:46:36.6936975Z Updating files: 41% (7957/19405) 2025-09-07T07:46:36.6967938Z Updating files: 42% (8151/19405) 2025-09-07T07:46:36.6999930Z Updating files: 43% (8345/19405) 2025-09-07T07:46:36.7030615Z Updating files: 44% (8539/19405) 2025-09-07T07:46:36.7061113Z Updating files: 45% (8733/19405) 2025-09-07T07:46:36.7091219Z Updating files: 46% (8927/19405) 2025-09-07T07:46:36.7120333Z Updating files: 47% (9121/19405) 2025-09-07T07:46:36.7148778Z Updating files: 48% (9315/19405) 2025-09-07T07:46:36.7176848Z Updating files: 49% (9509/19405) 2025-09-07T07:46:37.3824312Z Updating files: 50% (9703/19405) 2025-09-07T07:46:37.3856384Z Updating files: 51% (9897/19405) 2025-09-07T07:46:37.3889774Z Updating files: 52% (10091/19405) 2025-09-07T07:46:37.3926096Z Updating files: 53% (10285/19405) 2025-09-07T07:46:37.3958110Z Updating files: 54% (10479/19405) 2025-09-07T07:46:37.3993099Z Updating files: 55% (10673/19405) 2025-09-07T07:46:37.4028593Z Updating files: 56% (10867/19405) 2025-09-07T07:46:37.4062472Z Updating files: 57% (11061/19405) 2025-09-07T07:46:37.4098524Z Updating files: 58% (11255/19405) 2025-09-07T07:46:37.4131601Z Updating files: 59% (11449/19405) 2025-09-07T07:46:37.4166798Z Updating files: 60% (11643/19405) 2025-09-07T07:46:37.4199481Z Updating files: 61% (11838/19405) 2025-09-07T07:46:37.4234052Z Updating files: 62% (12032/19405) 2025-09-07T07:46:37.4268246Z Updating files: 63% (12226/19405) 2025-09-07T07:46:37.4302539Z Updating files: 64% (12420/19405) 2025-09-07T07:46:37.4337514Z Updating files: 65% (12614/19405) 2025-09-07T07:46:37.4370546Z Updating files: 66% (12808/19405) 2025-09-07T07:46:37.4402955Z Updating files: 67% (13002/19405) 2025-09-07T07:46:37.4436511Z Updating files: 68% (13196/19405) 2025-09-07T07:46:37.4616520Z Updating files: 69% (13390/19405) 2025-09-07T07:46:38.4765997Z Updating files: 70% (13584/19405) 2025-09-07T07:46:38.4945318Z Updating files: 70% 
(13658/19405) 2025-09-07T07:46:38.4981656Z Updating files: 71% (13778/19405) 2025-09-07T07:46:38.5049013Z Updating files: 72% (13972/19405) 2025-09-07T07:46:38.5212005Z Updating files: 73% (14166/19405) 2025-09-07T07:46:38.5260097Z Updating files: 73% (14330/19405) 2025-09-07T07:46:38.5469819Z Updating files: 74% (14360/19405) 2025-09-07T07:46:38.6023719Z Updating files: 75% (14554/19405) 2025-09-07T07:46:38.6163148Z Updating files: 76% (14748/19405) 2025-09-07T07:46:38.6268143Z Updating files: 77% (14942/19405) 2025-09-07T07:46:38.6515180Z Updating files: 78% (15136/19405) 2025-09-07T07:46:38.6747602Z Updating files: 79% (15330/19405) 2025-09-07T07:46:38.7059714Z Updating files: 80% (15524/19405) 2025-09-07T07:46:38.7371125Z Updating files: 81% (15719/19405) 2025-09-07T07:46:38.7595469Z Updating files: 82% (15913/19405) 2025-09-07T07:46:38.7705049Z Updating files: 83% (16107/19405) 2025-09-07T07:46:38.7835879Z Updating files: 84% (16301/19405) 2025-09-07T07:46:38.7992856Z Updating files: 85% (16495/19405) 2025-09-07T07:46:38.8127008Z Updating files: 86% (16689/19405) 2025-09-07T07:46:38.8256219Z Updating files: 87% (16883/19405) 2025-09-07T07:46:38.8349411Z Updating files: 88% (17077/19405) 2025-09-07T07:46:38.8482390Z Updating files: 89% (17271/19405) 2025-09-07T07:46:38.8650361Z Updating files: 90% (17465/19405) 2025-09-07T07:46:38.8752682Z Updating files: 91% (17659/19405) 2025-09-07T07:46:38.8889993Z Updating files: 92% (17853/19405) 2025-09-07T07:46:38.9068662Z Updating files: 93% (18047/19405) 2025-09-07T07:46:39.7997790Z Updating files: 94% (18241/19405) 2025-09-07T07:46:39.8152693Z Updating files: 94% (18311/19405) 2025-09-07T07:46:39.8579400Z Updating files: 95% (18435/19405) 2025-09-07T07:46:39.8732413Z Updating files: 96% (18629/19405) 2025-09-07T07:46:39.8910143Z Updating files: 97% (18823/19405) 2025-09-07T07:46:39.9175080Z Updating files: 98% (19017/19405) 2025-09-07T07:46:39.9323772Z Updating files: 99% (19211/19405) 2025-09-07T07:46:39.9324090Z 
Updating files: 100% (19405/19405) 2025-09-07T07:46:39.9324412Z Updating files: 100% (19405/19405), done. 2025-09-07T07:46:39.9596723Z Note: switching to '93fb23d6fae7c4e82c4239a1033e522088742634'. 2025-09-07T07:46:39.9597059Z 2025-09-07T07:46:39.9597340Z You are in 'detached HEAD' state. You can look around, make experimental 2025-09-07T07:46:39.9597902Z changes and commit them, and you can discard any commits you make in this 2025-09-07T07:46:39.9598735Z state without impacting any branches by switching back to a branch. 2025-09-07T07:46:39.9599082Z 2025-09-07T07:46:39.9599297Z If you want to create a new branch to retain commits you create, you may 2025-09-07T07:46:39.9599817Z do so (now or later) by using -c with the switch command. Example: 2025-09-07T07:46:39.9600117Z 2025-09-07T07:46:39.9600246Z git switch -c <new-branch-name> 2025-09-07T07:46:39.9600447Z 2025-09-07T07:46:39.9600570Z Or undo this operation with: 2025-09-07T07:46:39.9600753Z 2025-09-07T07:46:39.9600843Z git switch - 2025-09-07T07:46:39.9600992Z 2025-09-07T07:46:39.9601238Z Turn off this advice by setting config variable advice.detachedHead to false 2025-09-07T07:46:39.9601604Z 2025-09-07T07:46:39.9601784Z HEAD is now at 93fb23d6fae Build vLLM nightly wheels (#162000) 2025-09-07T07:46:39.9708242Z ##[endgroup] 2025-09-07T07:46:39.9708673Z ##[group]Setting up auth for fetching submodules 2025-09-07T07:46:39.9714451Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-09-07T07:46:39.9761077Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-09-07T07:46:39.9788829Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-09-07T07:46:39.9814784Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-09-07T07:46:39.9837904Z ##[endgroup] 2025-09-07T07:46:39.9838299Z ##[group]Fetching submodules 2025-09-07T07:46:39.9841194Z
[command]/usr/bin/git submodule sync --recursive 2025-09-07T07:46:40.0083464Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-09-07T07:46:40.0319043Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni' 2025-09-07T07:46:40.0328998Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16' 2025-09-07T07:46:40.0340335Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv' 2025-09-07T07:46:40.0351746Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK' 2025-09-07T07:46:40.0362306Z Submodule 'third_party/NVTX' (https://github.com/NVIDIA/NVTX.git) registered for path 'third_party/NVTX' 2025-09-07T07:46:40.0373691Z Submodule 'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator' 2025-09-07T07:46:40.0384419Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' 2025-09-07T07:46:40.0395303Z Submodule 'third_party/aiter' (https://github.com/ROCm/aiter.git) registered for path 'third_party/aiter' 2025-09-07T07:46:40.0406321Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark' 2025-09-07T07:46:40.0417707Z Submodule 'third_party/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/composable_kernel' 2025-09-07T07:46:40.0429036Z Submodule 'third_party/cpp-httplib' (https://github.com/yhirose/cpp-httplib.git) registered for path 'third_party/cpp-httplib' 2025-09-07T07:46:40.0439872Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 
'third_party/cpuinfo' 2025-09-07T07:46:40.0450824Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' 2025-09-07T07:46:40.0461927Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass' 2025-09-07T07:46:40.0473084Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm' 2025-09-07T07:46:40.0485066Z Submodule 'third_party/flash-attention' (https://github.com/Dao-AILab/flash-attention.git) registered for path 'third_party/flash-attention' 2025-09-07T07:46:40.0496142Z Submodule 'third_party/flatbuffers' (https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers' 2025-09-07T07:46:40.0507706Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' 2025-09-07T07:46:40.0518754Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:46:40.0530205Z Submodule 'third_party/gloo' (https://github.com/pytorch/gloo) registered for path 'third_party/gloo' 2025-09-07T07:46:40.0542011Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest' 2025-09-07T07:46:40.0553471Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep' 2025-09-07T07:46:40.0565262Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' 2025-09-07T07:46:40.0577716Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto' 2025-09-07T07:46:40.0589304Z Submodule 'third_party/kleidiai' (https://github.com/ARM-software/kleidiai.git) registered for path 'third_party/kleidiai' 2025-09-07T07:46:40.0601084Z Submodule 'third_party/mimalloc' 
(https://github.com/microsoft/mimalloc.git) registered for path 'third_party/mimalloc' 2025-09-07T07:46:40.0612405Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann' 2025-09-07T07:46:40.0624149Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx' 2025-09-07T07:46:40.0636346Z Submodule 'third_party/opentelemetry-cpp' (https://github.com/open-telemetry/opentelemetry-cpp.git) registered for path 'third_party/opentelemetry-cpp' 2025-09-07T07:46:40.0647816Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' 2025-09-07T07:46:40.0659176Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf' 2025-09-07T07:46:40.0671736Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd' 2025-09-07T07:46:40.0684243Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool' 2025-09-07T07:46:40.0696275Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11' 2025-09-07T07:46:40.0708557Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy' 2025-09-07T07:46:40.0721131Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef' 2025-09-07T07:46:40.0734852Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe' 2025-09-07T07:46:40.0770948Z Cloning into '/home/charlie/_work/pytorch/pytorch/android/libs/fbjni'... 2025-09-07T07:46:40.3340296Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/FP16'... 
2025-09-07T07:46:40.5423006Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/FXdiv'... 2025-09-07T07:46:40.7499708Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/NNPACK'... 2025-09-07T07:46:41.0368887Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/NVTX'... 2025-09-07T07:46:41.4614921Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'... 2025-09-07T07:46:42.7255950Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/XNNPACK'... 2025-09-07T07:47:00.9821259Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/aiter'... 2025-09-07T07:47:09.0059651Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/benchmark'... 2025-09-07T07:47:09.8044834Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/composable_kernel'... 2025-09-07T07:47:15.3872029Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/cpp-httplib'... 2025-09-07T07:47:16.3993736Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/cpuinfo'... 2025-09-07T07:47:17.4664669Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/cudnn_frontend'... 2025-09-07T07:47:19.8332240Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/cutlass'... 2025-09-07T07:47:22.3546503Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/fbgemm'... 2025-09-07T07:47:24.2540663Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/flash-attention'... 2025-09-07T07:47:24.8015702Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/flatbuffers'... 2025-09-07T07:47:26.0976360Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/fmt'... 2025-09-07T07:47:27.7291473Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'... 2025-09-07T07:47:28.1834764Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/gloo'... 
2025-09-07T07:47:28.5882454Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/googletest'... 2025-09-07T07:47:29.5414026Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/ideep'... 2025-09-07T07:47:29.8811470Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/ittapi'... 2025-09-07T07:47:30.3039620Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto'... 2025-09-07T07:47:33.0185155Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kleidiai'... 2025-09-07T07:47:33.5099224Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/mimalloc'... 2025-09-07T07:47:34.3723918Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/nlohmann'... 2025-09-07T07:47:45.4473012Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/onnx'... 2025-09-07T07:47:48.3919933Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/opentelemetry-cpp'... 2025-09-07T07:47:56.6039218Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/pocketfft'... 2025-09-07T07:47:56.8955192Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/protobuf'... 2025-09-07T07:48:07.8469017Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/psimd'... 2025-09-07T07:48:08.8497575Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/pthreadpool'... 2025-09-07T07:48:10.2139665Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/pybind11'... 2025-09-07T07:48:11.5132862Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/python-peachpy'... 2025-09-07T07:48:11.8308041Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/sleef'... 2025-09-07T07:48:12.6762620Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/tensorpipe'... 
2025-09-07T07:48:13.1377319Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-09-07T07:48:13.1498890Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-09-07T07:48:13.1617474Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-09-07T07:48:13.1889550Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-09-07T07:48:13.2696224Z Submodule path 'third_party/NVTX': checked out '2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07' 2025-09-07T07:48:13.3296754Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-09-07T07:48:14.0759322Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-09-07T07:48:14.2405060Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-09-07T07:48:14.2437570Z Submodule '3rdparty/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:48:14.2464675Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/aiter/3rdparty/composable_kernel'... 
2025-09-07T07:48:17.9662634Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-09-07T07:48:17.9937816Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-09-07T07:48:18.3393422Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-09-07T07:48:18.4421155Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-09-07T07:48:18.6026444Z Submodule path 'third_party/cpuinfo': checked out '5e3d2445e6a84d9599bee2bf78edbb4d80865e1d' 2025-09-07T07:48:18.7031523Z Submodule path 'third_party/cudnn_frontend': checked out 'f937055efc6d414d11f4c6577e3977fe74f35fb6' 2025-09-07T07:48:19.6205537Z Submodule path 'third_party/cutlass': checked out 'e51efbfe18fe4f4cbb66ab814c55bf4aa0185491' 2025-09-07T07:48:19.9633544Z Submodule path 'third_party/fbgemm': checked out '4b39c551efe15e6bbade20565b0ceb2d8ce3352d' 2025-09-07T07:48:19.9931571Z Submodule 'external/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/external/asmjit' 2025-09-07T07:48:20.0601599Z Submodule 'external/composable_kernel' (https://github.com/jwfromm/composable_kernel.git) registered for path 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:48:20.1122149Z Submodule 'external/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:48:20.1978446Z Submodule 'external/cutlass' (https://github.com/jwfromm/cutlass) registered for path 'third_party/fbgemm/external/cutlass' 2025-09-07T07:48:20.1984544Z Submodule 'external/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/external/googletest' 2025-09-07T07:48:20.1990482Z Submodule 'external/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 
'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:48:20.1996839Z Submodule 'external/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/fbgemm/external/json' 2025-09-07T07:48:20.2025864Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/fbgemm/external/asmjit'... 2025-09-07T07:48:21.3663980Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/fbgemm/external/composable_kernel'... 2025-09-07T07:48:22.6589443Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/fbgemm/external/cpuinfo'... 2025-09-07T07:48:24.1660972Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/fbgemm/external/cutlass'... 2025-09-07T07:48:27.4728355Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/fbgemm/external/googletest'... 2025-09-07T07:48:28.9799759Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/fbgemm/external/hipify_torch'... 2025-09-07T07:48:29.6545996Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/fbgemm/external/json'... 
2025-09-07T07:48:42.8283680Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-09-07T07:48:43.4094599Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out 'b1281b8b08d973a7064f864f47eeb30f3e2596e9' 2025-09-07T07:48:43.5338968Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-09-07T07:48:44.2309294Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '311f3c8e51dc0eb56310cfc6980bf63d0fbd7917' 2025-09-07T07:48:44.4050085Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-09-07T07:48:44.4394294Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-09-07T07:48:44.5677140Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-09-07T07:48:44.6529379Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-09-07T07:48:44.6955675Z Submodule 'csrc/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:48:44.7267891Z Submodule 'csrc/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:48:44.7295727Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/flash-attention/csrc/composable_kernel'... 2025-09-07T07:48:50.2566510Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/flash-attention/csrc/cutlass'... 
2025-09-07T07:48:55.5909877Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-09-07T07:48:56.2801510Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-09-07T07:48:56.4333275Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-09-07T07:48:56.4925159Z Submodule path 'third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-09-07T07:48:56.5612035Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-09-07T07:48:56.6116785Z Submodule path 'third_party/gloo': checked out 'c7b7b022c124d9643957d9bd55f57ac59fce8fa2' 2025-09-07T07:48:56.6841276Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-09-07T07:48:56.7216194Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-09-07T07:48:56.7531858Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn' 2025-09-07T07:48:56.7556082Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'... 
2025-09-07T07:49:16.2714946Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-09-07T07:49:16.3180758Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-09-07T07:49:16.4516490Z Submodule path 'third_party/kineto': checked out '5e7501833f1021ce6f618572d3baf657b6319658' 2025-09-07T07:49:16.5650506Z Submodule 'libkineto/third_party/dynolog' (https://github.com/facebookincubator/dynolog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:49:16.7067381Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:49:16.8522779Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:49:16.8972093Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog'... 2025-09-07T07:49:18.5770951Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'... 2025-09-07T07:49:20.7270701Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'... 
2025-09-07T07:49:22.2479664Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e' 2025-09-07T07:49:22.2780086Z Submodule 'third_party/DCGM' (https://github.com/NVIDIA/DCGM.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:49:22.3028364Z Submodule 'third_party/cpr' (https://github.com/libcpr/cpr.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:49:22.3466880Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:49:22.3684301Z Submodule 'third_party/gflags' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:49:22.3896097Z Submodule 'third_party/glog' (https://github.com/google/glog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:49:22.4846374Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:49:22.5481911Z Submodule 'third_party/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:49:22.5767159Z Submodule 'third_party/pfs' (https://github.com/dtrugman/pfs.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:49:22.5797507Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'... 2025-09-07T07:49:25.3653933Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'... 
2025-09-07T07:49:26.7244071Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'... 2025-09-07T07:49:30.4175048Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'... 2025-09-07T07:49:31.6348663Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/glog'... 2025-09-07T07:49:32.6888247Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'... 2025-09-07T07:49:34.1155920Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json'... 2025-09-07T07:49:51.2467991Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'... 2025-09-07T07:49:52.8531057Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-09-07T07:49:52.9616619Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-09-07T07:49:53.1954134Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-09-07T07:49:53.3151618Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-09-07T07:49:53.4085379Z Submodule 'doc' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:49:53.4336183Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'... 
2025-09-07T07:49:54.3118813Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-09-07T07:49:54.4476153Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-09-07T07:49:54.5128877Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850' 2025-09-07T07:49:54.7940620Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-09-07T07:49:54.8231289Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-09-07T07:49:54.9877510Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164' 2025-09-07T07:49:55.1626773Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2025-09-07T07:49:55.2677923Z Submodule path 'third_party/kleidiai': checked out 'cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7' 2025-09-07T07:49:55.3293883Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-09-07T07:49:55.4570963Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-09-07T07:49:56.3022066Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-09-07T07:49:56.3267415Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11' 2025-09-07T07:49:56.3298007Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'... 
2025-09-07T07:49:58.1489487Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-09-07T07:49:58.2292026Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-09-07T07:49:58.2590132Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark) registered for path 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:49:58.2989752Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:49:58.3250742Z Submodule 'third_party/ms-gsl' (https://github.com/microsoft/GSL) registered for path 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:49:58.3523550Z Submodule 'third_party/nlohmann-json' (https://github.com/nlohmann/json) registered for path 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:49:58.3901690Z Submodule 'third_party/opentelemetry-proto' (https://github.com/open-telemetry/opentelemetry-proto) registered for path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:49:58.4082538Z Submodule 'third_party/opentracing-cpp' (https://github.com/opentracing/opentracing-cpp.git) registered for path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:49:58.4436888Z Submodule 'third_party/prometheus-cpp' (https://github.com/jupp0r/prometheus-cpp) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:49:58.4930891Z Submodule 'tools/vcpkg' (https://github.com/Microsoft/vcpkg) registered for path 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:49:58.4960957Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/benchmark'... 
2025-09-07T07:49:59.2408708Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/googletest'... 2025-09-07T07:50:01.4667748Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/ms-gsl'... 2025-09-07T07:50:02.3679827Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/nlohmann-json'... 2025-09-07T07:50:17.0995393Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentelemetry-proto'... 2025-09-07T07:50:17.7286234Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp'... 2025-09-07T07:50:18.3148935Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp'... 2025-09-07T07:50:18.8960742Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/opentelemetry-cpp/tools/vcpkg'... 2025-09-07T07:50:27.6677168Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-09-07T07:50:27.7387537Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-09-07T07:50:27.7821889Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-09-07T07:50:27.9175459Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-09-07T07:50:27.9412828Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-09-07T07:50:27.9819643Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-09-07T07:50:28.0260979Z Submodule path 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-09-07T07:50:28.0581788Z Submodule 'civetweb' (https://github.com/civetweb/civetweb.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:28.1015755Z Submodule 'googletest' (https://github.com/google/googletest.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:28.1042237Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'... 2025-09-07T07:50:30.8013763Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'... 2025-09-07T07:50:32.6922004Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-09-07T07:50:32.7944954Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-09-07T07:50:33.4411945Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-09-07T07:50:33.4638433Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-09-07T07:50:33.9159369Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-09-07T07:50:33.9197044Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:33.9207737Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:33.9235042Z Cloning into 
'/home/charlie/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'... 2025-09-07T07:50:34.3772309Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'... 2025-09-07T07:50:35.3268370Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-09-07T07:50:35.3997404Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-09-07T07:50:35.4102391Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-09-07T07:50:35.4227431Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-09-07T07:50:35.4620728Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-09-07T07:50:35.4943312Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-09-07T07:50:35.5904754Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-09-07T07:50:35.6167769Z Submodule path 'third_party/tensorpipe': checked out 'af0118d13e52f5a08841464a768e01a0bf3e3075' 2025-09-07T07:50:35.6194241Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:35.6206251Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:35.6215112Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:35.6226354Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:35.6252543Z Cloning into 
'/home/charlie/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'... 2025-09-07T07:50:36.6019882Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'... 2025-09-07T07:50:36.8478166Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'... 2025-09-07T07:50:37.9786338Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'... 2025-09-07T07:50:38.9321341Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-09-07T07:50:38.9477613Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-09-07T07:50:39.0221791Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-09-07T07:50:39.0513015Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-09-07T07:50:39.0538040Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:39.0564097Z Cloning into '/home/charlie/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'... 
2025-09-07T07:50:39.2978426Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-09-07T07:50:39.3016741Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-09-07T07:50:39.3262929Z Entering 'android/libs/fbjni' 2025-09-07T07:50:39.3300842Z Entering 'third_party/FP16' 2025-09-07T07:50:39.3340959Z Entering 'third_party/FXdiv' 2025-09-07T07:50:39.3378439Z Entering 'third_party/NNPACK' 2025-09-07T07:50:39.3416829Z Entering 'third_party/NVTX' 2025-09-07T07:50:39.3455459Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:39.3493823Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:39.3551018Z Entering 'third_party/aiter' 2025-09-07T07:50:39.3590764Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:39.3637201Z Entering 'third_party/benchmark' 2025-09-07T07:50:39.3680848Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:39.3727966Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:39.3768316Z Entering 'third_party/cpuinfo' 2025-09-07T07:50:39.3805755Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:50:39.3846623Z Entering 'third_party/cutlass' 2025-09-07T07:50:39.3895564Z Entering 'third_party/fbgemm' 2025-09-07T07:50:39.3937559Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:39.3974969Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:39.4021682Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:39.4059943Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:39.4108047Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:39.4144966Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:39.4181265Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:39.4224499Z Entering 'third_party/flash-attention' 2025-09-07T07:50:39.4265616Z Entering 'third_party/flash-attention/csrc/composable_kernel' 
2025-09-07T07:50:39.4309260Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:39.4359639Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:39.4401035Z Entering 'third_party/fmt' 2025-09-07T07:50:39.4440707Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:39.4478987Z Entering 'third_party/gloo' 2025-09-07T07:50:39.4519301Z Entering 'third_party/googletest' 2025-09-07T07:50:39.4556935Z Entering 'third_party/ideep' 2025-09-07T07:50:39.4594772Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:39.4640763Z Entering 'third_party/ittapi' 2025-09-07T07:50:39.4680334Z Entering 'third_party/kineto' 2025-09-07T07:50:39.4742906Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:39.4782029Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:39.4854146Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:39.4918289Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:39.4969676Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:39.5006779Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:39.5044545Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:39.5081585Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:39.5119116Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:39.5156569Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:39.5194111Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:39.5232344Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:39.5270402Z Entering 'third_party/kleidiai' 2025-09-07T07:50:39.5312329Z Entering 'third_party/mimalloc' 
2025-09-07T07:50:39.5351471Z Entering 'third_party/nlohmann' 2025-09-07T07:50:39.5391892Z Entering 'third_party/onnx' 2025-09-07T07:50:39.5448503Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:39.5488638Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:39.5529885Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:39.5567099Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:39.5603838Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:39.5639480Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:50:39.5677753Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:39.5713485Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:39.5749725Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:39.5786651Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:39.5824961Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:39.5870534Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:39.5930404Z Entering 'third_party/pocketfft' 2025-09-07T07:50:39.5968597Z Entering 'third_party/protobuf' 2025-09-07T07:50:39.6010692Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:39.6046678Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:39.6086162Z Entering 'third_party/psimd' 2025-09-07T07:50:39.6124602Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:39.6162468Z Entering 'third_party/pybind11' 2025-09-07T07:50:39.6200445Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:39.6240329Z Entering 'third_party/sleef' 2025-09-07T07:50:39.6280879Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:39.6321008Z Entering 'third_party/tensorpipe/third_party/googletest' 
2025-09-07T07:50:39.6356441Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:39.6392643Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:39.6429765Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:39.6465204Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:39.6515428Z ##[endgroup] 2025-09-07T07:50:39.6516135Z ##[group]Persisting credentials for submodules 2025-09-07T07:50:39.6524590Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-09-07T07:50:39.6767429Z Entering 'android/libs/fbjni' 2025-09-07T07:50:39.6811158Z Entering 'third_party/FP16' 2025-09-07T07:50:39.6855272Z Entering 'third_party/FXdiv' 2025-09-07T07:50:39.6897128Z Entering 'third_party/NNPACK' 2025-09-07T07:50:39.6942356Z Entering 'third_party/NVTX' 2025-09-07T07:50:39.6987109Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:39.7032834Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:39.7091941Z Entering 'third_party/aiter' 2025-09-07T07:50:39.7136030Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:39.7188132Z Entering 'third_party/benchmark' 2025-09-07T07:50:39.7231899Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:39.7285830Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:39.7328444Z Entering 'third_party/cpuinfo' 2025-09-07T07:50:39.7372150Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:50:39.7417260Z Entering 'third_party/cutlass' 2025-09-07T07:50:39.7470122Z Entering 'third_party/fbgemm' 2025-09-07T07:50:39.7516100Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:39.7558546Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:39.7608622Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:39.7650710Z Entering 
'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:39.7700184Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:39.7743000Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:39.7784413Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:39.7828984Z Entering 'third_party/flash-attention' 2025-09-07T07:50:39.7874577Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:39.7922400Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:39.7974798Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:39.8020744Z Entering 'third_party/fmt' 2025-09-07T07:50:39.8064639Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:39.8108704Z Entering 'third_party/gloo' 2025-09-07T07:50:39.8152956Z Entering 'third_party/googletest' 2025-09-07T07:50:39.8196535Z Entering 'third_party/ideep' 2025-09-07T07:50:39.8240373Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:39.8290435Z Entering 'third_party/ittapi' 2025-09-07T07:50:39.8334227Z Entering 'third_party/kineto' 2025-09-07T07:50:39.8377530Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:39.8418626Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:39.8460844Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:39.8502905Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:39.8546176Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:39.8587753Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:39.8629897Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:39.8672839Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:39.8715163Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:39.8757763Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:39.8802047Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:39.8843399Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:39.8888548Z Entering 'third_party/kleidiai' 2025-09-07T07:50:39.9080250Z Entering 'third_party/mimalloc' 2025-09-07T07:50:39.9126079Z Entering 'third_party/nlohmann' 2025-09-07T07:50:39.9172766Z Entering 'third_party/onnx' 2025-09-07T07:50:39.9234955Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:39.9279627Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:39.9326657Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:39.9368256Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:39.9409812Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:39.9451394Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:50:39.9495994Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:39.9537339Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:39.9577543Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:39.9619691Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:39.9663372Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:39.9707304Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:39.9772110Z Entering 'third_party/pocketfft' 2025-09-07T07:50:39.9818334Z Entering 'third_party/protobuf' 2025-09-07T07:50:39.9866018Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:39.9907822Z Entering 
'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:39.9952534Z Entering 'third_party/psimd' 2025-09-07T07:50:39.9996547Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:40.0040051Z Entering 'third_party/pybind11' 2025-09-07T07:50:40.0084166Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:40.0127924Z Entering 'third_party/sleef' 2025-09-07T07:50:40.0172176Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:40.0214826Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:40.0257001Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:40.0298881Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:40.0341322Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:40.0382156Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:40.0440547Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-09-07T07:50:40.0680110Z Entering 'android/libs/fbjni' 2025-09-07T07:50:40.0717676Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-09-07T07:50:40.0736038Z Entering 'third_party/FP16' 2025-09-07T07:50:40.0773758Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-09-07T07:50:40.0792055Z Entering 'third_party/FXdiv' 2025-09-07T07:50:40.0830232Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-09-07T07:50:40.0849324Z Entering 'third_party/NNPACK' 2025-09-07T07:50:40.0889974Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-09-07T07:50:40.0907760Z Entering 'third_party/NVTX' 2025-09-07T07:50:40.0945735Z 
file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-09-07T07:50:40.0964965Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:40.1002181Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-09-07T07:50:40.1020075Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:40.1058183Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-09-07T07:50:40.1092360Z Entering 'third_party/aiter' 2025-09-07T07:50:40.1130965Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-09-07T07:50:40.1149784Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:40.1189899Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-09-07T07:50:40.1215047Z Entering 'third_party/benchmark' 2025-09-07T07:50:40.1252470Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:50:40.1270881Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:40.1309741Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-09-07T07:50:40.1336891Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:40.1374529Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-09-07T07:50:40.1392826Z Entering 'third_party/cpuinfo' 2025-09-07T07:50:40.1430998Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-09-07T07:50:40.1450259Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:50:40.1488139Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-09-07T07:50:40.1506886Z Entering 'third_party/cutlass' 
2025-09-07T07:50:40.1545545Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-09-07T07:50:40.1574829Z Entering 'third_party/fbgemm' 2025-09-07T07:50:40.1613919Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-09-07T07:50:40.1635117Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:40.1675513Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-09-07T07:50:40.1693810Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:40.1730948Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-09-07T07:50:40.1756729Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:40.1793778Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-09-07T07:50:40.1812783Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:40.1849991Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-09-07T07:50:40.1878006Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:40.1917748Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-09-07T07:50:40.1935381Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:40.1975973Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-09-07T07:50:40.1992715Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:40.2033020Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-09-07T07:50:40.2054595Z Entering 
'third_party/flash-attention' 2025-09-07T07:50:40.2094345Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-09-07T07:50:40.2113697Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:40.2151528Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-09-07T07:50:40.2177316Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:40.2214489Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-09-07T07:50:40.2243310Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:40.2280513Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-09-07T07:50:40.2301943Z Entering 'third_party/fmt' 2025-09-07T07:50:40.2340691Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-09-07T07:50:40.2359358Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:40.2398114Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-09-07T07:50:40.2417135Z Entering 'third_party/gloo' 2025-09-07T07:50:40.2455068Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-09-07T07:50:40.2473671Z Entering 'third_party/googletest' 2025-09-07T07:50:40.2514317Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:50:40.2533382Z Entering 'third_party/ideep' 2025-09-07T07:50:40.2573285Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-09-07T07:50:40.2590051Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:40.2626820Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config 
remote.origin.url 2025-09-07T07:50:40.2653680Z Entering 'third_party/ittapi' 2025-09-07T07:50:40.2695035Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-09-07T07:50:40.2713547Z Entering 'third_party/kineto' 2025-09-07T07:50:40.2751779Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-09-07T07:50:40.2770601Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:40.2809145Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-09-07T07:50:40.2826070Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:40.2863452Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-09-07T07:50:40.2882002Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:40.2919691Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-09-07T07:50:40.2937328Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:40.2975062Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-09-07T07:50:40.2992691Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:40.3031631Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-09-07T07:50:40.3048419Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:40.3089030Z 
file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-09-07T07:50:40.3108091Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:40.3146218Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-09-07T07:50:40.3163866Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:40.3200585Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:50:40.3218539Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:40.3256548Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-09-07T07:50:40.3275204Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:40.3316304Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-09-07T07:50:40.3336150Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:40.3375462Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-09-07T07:50:40.3393367Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:40.3432276Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-09-07T07:50:40.3451785Z Entering 'third_party/kleidiai' 2025-09-07T07:50:40.3490237Z 
file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-09-07T07:50:40.3509597Z Entering 'third_party/mimalloc' 2025-09-07T07:50:40.3547904Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-09-07T07:50:40.3566857Z Entering 'third_party/nlohmann' 2025-09-07T07:50:40.3607257Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-09-07T07:50:40.3626994Z Entering 'third_party/onnx' 2025-09-07T07:50:40.3667482Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-09-07T07:50:40.3703826Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:40.3743955Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:50:40.3765411Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:40.3806188Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-09-07T07:50:40.3825317Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:40.3864838Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:50:40.3881873Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:40.3920095Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:50:40.3937802Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:40.3979443Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-09-07T07:50:40.3997305Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 
2025-09-07T07:50:40.4038258Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-09-07T07:50:40.4058454Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:40.4097846Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-09-07T07:50:40.4115571Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:40.4154547Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-09-07T07:50:40.4174068Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:40.4211551Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-09-07T07:50:40.4228492Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:40.4267611Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-09-07T07:50:40.4287013Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:40.4325824Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-09-07T07:50:40.4345177Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:40.4382484Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-09-07T07:50:40.4423889Z Entering 'third_party/pocketfft' 2025-09-07T07:50:40.4461978Z 
file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-09-07T07:50:40.4481220Z Entering 'third_party/protobuf' 2025-09-07T07:50:40.4520629Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-09-07T07:50:40.4543155Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:40.4580862Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:50:40.4598692Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:40.4636658Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:50:40.4658156Z Entering 'third_party/psimd' 2025-09-07T07:50:40.4697909Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-09-07T07:50:40.4716862Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:40.4755535Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-09-07T07:50:40.4773788Z Entering 'third_party/pybind11' 2025-09-07T07:50:40.4816235Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:50:40.4835449Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:40.4875208Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-09-07T07:50:40.4893215Z Entering 'third_party/sleef' 2025-09-07T07:50:40.4933246Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-09-07T07:50:40.4951573Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:40.4990078Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-09-07T07:50:40.5008649Z Entering 
'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:40.5046890Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:50:40.5064463Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:40.5105217Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-09-07T07:50:40.5122185Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:40.5158977Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-09-07T07:50:40.5177316Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:40.5216872Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:50:40.5233437Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:40.5270532Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-09-07T07:50:40.5504196Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-09-07T07:50:40.5741236Z Entering 'android/libs/fbjni' 2025-09-07T07:50:40.5778077Z Entering 'third_party/FP16' 2025-09-07T07:50:40.5815859Z Entering 'third_party/FXdiv' 2025-09-07T07:50:40.5854063Z Entering 'third_party/NNPACK' 2025-09-07T07:50:40.5891940Z Entering 'third_party/NVTX' 2025-09-07T07:50:40.5930424Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:40.5971277Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:40.6025817Z Entering 'third_party/aiter' 2025-09-07T07:50:40.6065134Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:40.6115919Z Entering 'third_party/benchmark' 
2025-09-07T07:50:40.6154207Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:40.6200520Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:40.6238550Z Entering 'third_party/cpuinfo' 2025-09-07T07:50:40.6277488Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:50:40.6319721Z Entering 'third_party/cutlass' 2025-09-07T07:50:40.6370407Z Entering 'third_party/fbgemm' 2025-09-07T07:50:40.6411099Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:40.6450862Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:40.6495853Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:40.6533048Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:40.6582474Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:40.6618306Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:40.6654484Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:40.6694148Z Entering 'third_party/flash-attention' 2025-09-07T07:50:40.6733398Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:40.6776811Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:40.6822734Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:40.6863825Z Entering 'third_party/fmt' 2025-09-07T07:50:40.6902549Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:40.6942185Z Entering 'third_party/gloo' 2025-09-07T07:50:40.6979931Z Entering 'third_party/googletest' 2025-09-07T07:50:40.7017588Z Entering 'third_party/ideep' 2025-09-07T07:50:40.7054194Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:40.7097429Z Entering 'third_party/ittapi' 2025-09-07T07:50:40.7135910Z Entering 'third_party/kineto' 2025-09-07T07:50:40.7177204Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:40.7212792Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:40.7250421Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:40.7288084Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:40.7324891Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:40.7360512Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:40.7399212Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:40.7437183Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:40.7476126Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:40.7513683Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:40.7552031Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:40.7587265Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:40.7626121Z Entering 'third_party/kleidiai' 2025-09-07T07:50:40.7665268Z Entering 'third_party/mimalloc' 2025-09-07T07:50:40.7703599Z Entering 'third_party/nlohmann' 2025-09-07T07:50:40.7742897Z Entering 'third_party/onnx' 2025-09-07T07:50:40.7798808Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:40.7838583Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:40.7878555Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:40.7914413Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:40.7950309Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:40.7987503Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:50:40.8024851Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:40.8064078Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:40.8100137Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:40.8135624Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:40.8173468Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:40.8212020Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:40.8270104Z Entering 'third_party/pocketfft' 2025-09-07T07:50:40.8311733Z Entering 'third_party/protobuf' 2025-09-07T07:50:40.8353371Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:40.8389256Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:40.8428751Z Entering 'third_party/psimd' 2025-09-07T07:50:40.8468663Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:40.8506822Z Entering 'third_party/pybind11' 2025-09-07T07:50:40.8544623Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:40.8582999Z Entering 'third_party/sleef' 2025-09-07T07:50:40.8622934Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:40.8660600Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:40.8695454Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:40.8732413Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:40.8769963Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:40.8805601Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:40.8863850Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-09-07T07:50:40.9102122Z Entering 'android/libs/fbjni' 2025-09-07T07:50:40.9139786Z Entering 'third_party/FP16' 2025-09-07T07:50:40.9178083Z Entering 'third_party/FXdiv' 2025-09-07T07:50:40.9215921Z Entering 'third_party/NNPACK' 2025-09-07T07:50:40.9257373Z Entering 'third_party/NVTX' 2025-09-07T07:50:40.9296395Z Entering 'third_party/VulkanMemoryAllocator' 
2025-09-07T07:50:40.9337343Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:40.9391955Z Entering 'third_party/aiter' 2025-09-07T07:50:40.9433457Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:40.9479738Z Entering 'third_party/benchmark' 2025-09-07T07:50:40.9518362Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:40.9569524Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:40.9607185Z Entering 'third_party/cpuinfo' 2025-09-07T07:50:40.9647245Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:50:40.9685813Z Entering 'third_party/cutlass' 2025-09-07T07:50:40.9734237Z Entering 'third_party/fbgemm' 2025-09-07T07:50:40.9776071Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:40.9812500Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:40.9855281Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:40.9891275Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:40.9936088Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:40.9972464Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:41.0009263Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:41.0048542Z Entering 'third_party/flash-attention' 2025-09-07T07:50:41.0087632Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:41.0129718Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:41.0175579Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:41.0216551Z Entering 'third_party/fmt' 2025-09-07T07:50:41.0258030Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:41.0296918Z Entering 'third_party/gloo' 2025-09-07T07:50:41.0335912Z Entering 'third_party/googletest' 2025-09-07T07:50:41.0374805Z Entering 'third_party/ideep' 2025-09-07T07:50:41.0412141Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:41.0456171Z Entering 'third_party/ittapi' 2025-09-07T07:50:41.0494387Z Entering 'third_party/kineto' 
2025-09-07T07:50:41.0531928Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:41.0569028Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:41.0607015Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:41.0643769Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:41.0681398Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:41.0719910Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:41.0758383Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:41.0797205Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:41.0835157Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:41.0873237Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:41.0911078Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:41.0947846Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:41.0985902Z Entering 'third_party/kleidiai' 2025-09-07T07:50:41.1025005Z Entering 'third_party/mimalloc' 2025-09-07T07:50:41.1063199Z Entering 'third_party/nlohmann' 2025-09-07T07:50:41.1103549Z Entering 'third_party/onnx' 2025-09-07T07:50:41.1160730Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:41.1200787Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:41.1242124Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:41.1277402Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:41.1318852Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:41.1355302Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 
2025-09-07T07:50:41.1393266Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:41.1429238Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:41.1465694Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:41.1501764Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:41.1540116Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:41.1578075Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:41.1635978Z Entering 'third_party/pocketfft' 2025-09-07T07:50:41.1674958Z Entering 'third_party/protobuf' 2025-09-07T07:50:41.1716799Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:41.1752804Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:41.1792839Z Entering 'third_party/psimd' 2025-09-07T07:50:41.1829216Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:41.1867619Z Entering 'third_party/pybind11' 2025-09-07T07:50:41.1906468Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:41.1945007Z Entering 'third_party/sleef' 2025-09-07T07:50:41.1983928Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:41.2022350Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:41.2058300Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:41.2097629Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:41.2135000Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:41.2172966Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:41.2223772Z ##[endgroup] 2025-09-07T07:50:41.2269132Z [command]/usr/bin/git log -1 --format=%H 2025-09-07T07:50:41.2292870Z 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:50:41.2454814Z ##[group]Run actions/checkout@v4 2025-09-07T07:50:41.2455123Z with: 
2025-09-07T07:50:41.2455379Z ref: 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:50:41.2455711Z fetch-depth: 0 2025-09-07T07:50:41.2455963Z submodules: recursive 2025-09-07T07:50:41.2456236Z show-progress: false 2025-09-07T07:50:41.2456525Z repository: pytorch/pytorch 2025-09-07T07:50:41.2456933Z token: *** 2025-09-07T07:50:41.2457168Z ssh-strict: true 2025-09-07T07:50:41.2457503Z ssh-user: git 2025-09-07T07:50:41.2457764Z persist-credentials: true 2025-09-07T07:50:41.2458032Z clean: true 2025-09-07T07:50:41.2458304Z sparse-checkout-cone-mode: true 2025-09-07T07:50:41.2458613Z fetch-tags: false 2025-09-07T07:50:41.2458852Z lfs: false 2025-09-07T07:50:41.2459074Z set-safe-directory: true 2025-09-07T07:50:41.2459341Z env: 2025-09-07T07:50:41.2459572Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:50:41.2459837Z ##[endgroup] 2025-09-07T07:50:41.3417156Z Syncing repository: pytorch/pytorch 2025-09-07T07:50:41.3420335Z ##[group]Getting Git version info 2025-09-07T07:50:41.3420751Z Working directory is '/home/charlie/_work/pytorch/pytorch' 2025-09-07T07:50:41.3453366Z [command]/usr/bin/git version 2025-09-07T07:50:41.3488238Z git version 2.51.0 2025-09-07T07:50:41.3511680Z ##[endgroup] 2025-09-07T07:50:41.3523315Z Temporarily overriding HOME='/home/charlie/_work/_temp/3e910201-4530-42c4-98de-4e5cbcc71f86' before making global git config changes 2025-09-07T07:50:41.3524229Z Adding repository directory to the temporary git global config as a safe directory 2025-09-07T07:50:41.3528585Z [command]/usr/bin/git config --global --add safe.directory /home/charlie/_work/pytorch/pytorch 2025-09-07T07:50:41.3565525Z [command]/usr/bin/git config --local --get remote.origin.url 2025-09-07T07:50:41.3585852Z https://github.com/pytorch/pytorch 2025-09-07T07:50:41.3598698Z ##[group]Removing previously created refs, to avoid conflicts 2025-09-07T07:50:41.3602107Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD 2025-09-07T07:50:41.3622517Z HEAD 
2025-09-07T07:50:41.3657568Z ##[endgroup] 2025-09-07T07:50:41.3661092Z [command]/usr/bin/git submodule status 2025-09-07T07:50:41.3933929Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe) 2025-09-07T07:50:41.4017717Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 third_party/FP16 (4dfe081) 2025-09-07T07:50:41.4099209Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327) 2025-09-07T07:50:41.4194013Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0) 2025-09-07T07:50:41.4250600Z 2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07 third_party/NVTX (v3.1.0-263-g2942f16) 2025-09-07T07:50:41.4341048Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600) 2025-09-07T07:50:41.4981175Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656) 2025-09-07T07:50:41.5019760Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101) 2025-09-07T07:50:41.5044698Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3) 2025-09-07T07:50:41.5127848Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d) 2025-09-07T07:50:41.5268255Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0) 2025-09-07T07:50:41.5391825Z 5e3d2445e6a84d9599bee2bf78edbb4d80865e1d third_party/cpuinfo (5e3d244) 2025-09-07T07:50:41.5432247Z f937055efc6d414d11f4c6577e3977fe74f35fb6 third_party/cudnn_frontend (v0.5-52-gf937055) 2025-09-07T07:50:41.5533488Z e51efbfe18fe4f4cbb66ab814c55bf4aa0185491 third_party/cutlass (v4.1.0) 2025-09-07T07:50:41.5595297Z 4b39c551efe15e6bbade20565b0ceb2d8ce3352d third_party/fbgemm (v1.3.0-rc1-342-g4b39c551) 2025-09-07T07:50:41.5687630Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4) 2025-09-07T07:50:41.5712641Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23) 2025-09-07T07:50:41.6138618Z 
40626af88bd7df9a5fb80be7b25ac85b122d6c21 third_party/fmt (11.2.0) 2025-09-07T07:50:41.6250246Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17) 2025-09-07T07:50:41.6376809Z c7b7b022c124d9643957d9bd55f57ac59fce8fa2 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-33-gc7b7b02) 2025-09-07T07:50:41.6623489Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108) 2025-09-07T07:50:41.6708269Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1) 2025-09-07T07:50:41.6796649Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5) 2025-09-07T07:50:41.7065676Z 5e7501833f1021ce6f618572d3baf657b6319658 third_party/kineto (remotes/origin/sraikund/test-98-g5e75018) 2025-09-07T07:50:41.7093298Z cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7 third_party/kleidiai (v1.8.0) 2025-09-07T07:50:41.7119481Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4) 2025-09-07T07:50:41.7145777Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0) 2025-09-07T07:50:41.7489121Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0) 2025-09-07T07:50:41.7515825Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2) 2025-09-07T07:50:41.7544794Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5) 2025-09-07T07:50:41.7912380Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b) 2025-09-07T07:50:41.7989764Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-09-07T07:50:41.8049090Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-09-07T07:50:41.8075663Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 (v3.0.1) 2025-09-07T07:50:41.8152202Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy 
(remotes/origin/pre-generated) 2025-09-07T07:50:41.8236141Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-09-07T07:50:41.8309678Z af0118d13e52f5a08841464a768e01a0bf3e3075 third_party/tensorpipe (heads/main) 2025-09-07T07:50:41.8320459Z ##[group]Cleaning the repository 2025-09-07T07:50:41.8324965Z [command]/usr/bin/git clean -ffdx 2025-09-07T07:50:41.8633197Z [command]/usr/bin/git reset --hard HEAD 2025-09-07T07:50:42.0917397Z HEAD is now at 93fb23d6fae Build vLLM nightly wheels (#162000) 2025-09-07T07:50:42.0942760Z ##[endgroup] 2025-09-07T07:50:42.0943978Z ##[group]Disabling automatic garbage collection 2025-09-07T07:50:42.0949332Z [command]/usr/bin/git config --local gc.auto 0 2025-09-07T07:50:42.0980294Z ##[endgroup] 2025-09-07T07:50:42.0981003Z ##[group]Setting up auth 2025-09-07T07:50:42.0985795Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-09-07T07:50:42.1018653Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-09-07T07:50:42.1262051Z Entering 'android/libs/fbjni' 2025-09-07T07:50:42.1304601Z Entering 'third_party/FP16' 2025-09-07T07:50:42.1347470Z Entering 'third_party/FXdiv' 2025-09-07T07:50:42.1390124Z Entering 'third_party/NNPACK' 2025-09-07T07:50:42.1434016Z Entering 'third_party/NVTX' 2025-09-07T07:50:42.1476435Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:42.1519499Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:42.1578357Z Entering 'third_party/aiter' 2025-09-07T07:50:42.1622666Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:42.1913517Z Entering 'third_party/benchmark' 2025-09-07T07:50:42.1956403Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:42.2008767Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:42.2055159Z Entering 'third_party/cpuinfo' 2025-09-07T07:50:42.2098589Z Entering 
'third_party/cudnn_frontend' 2025-09-07T07:50:42.2142920Z Entering 'third_party/cutlass' 2025-09-07T07:50:42.2197439Z Entering 'third_party/fbgemm' 2025-09-07T07:50:42.2244081Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:42.2285438Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:42.2335533Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:42.2378833Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:42.2432302Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:42.2477481Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:42.2518951Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:42.2564448Z Entering 'third_party/flash-attention' 2025-09-07T07:50:42.2607298Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:42.2655237Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:42.2708071Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:42.2753091Z Entering 'third_party/fmt' 2025-09-07T07:50:42.2796852Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:42.2839286Z Entering 'third_party/gloo' 2025-09-07T07:50:42.2882541Z Entering 'third_party/googletest' 2025-09-07T07:50:42.2926073Z Entering 'third_party/ideep' 2025-09-07T07:50:42.2968292Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:42.3017879Z Entering 'third_party/ittapi' 2025-09-07T07:50:42.3061325Z Entering 'third_party/kineto' 2025-09-07T07:50:42.3104736Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:42.3147021Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:42.3189663Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:42.3231416Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:42.3271729Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 
2025-09-07T07:50:42.3312897Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:42.3355404Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:42.3395604Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:42.3437888Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:42.3480591Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:42.3523920Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:42.3564354Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:42.3607532Z Entering 'third_party/kleidiai' 2025-09-07T07:50:42.3650343Z Entering 'third_party/mimalloc' 2025-09-07T07:50:42.3693791Z Entering 'third_party/nlohmann' 2025-09-07T07:50:42.3737904Z Entering 'third_party/onnx' 2025-09-07T07:50:42.3799536Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:42.3845131Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:42.3889486Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:42.3930356Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:42.3971635Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:42.4012709Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:50:42.4054937Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:42.4096084Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:42.4137162Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:42.4179058Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:42.4221151Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:42.4266157Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:42.4327506Z Entering 'third_party/pocketfft' 2025-09-07T07:50:42.4370764Z Entering 'third_party/protobuf' 2025-09-07T07:50:42.4417338Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:42.4458484Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:42.4502608Z Entering 'third_party/psimd' 2025-09-07T07:50:42.4545101Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:42.4587687Z Entering 'third_party/pybind11' 2025-09-07T07:50:42.4631425Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:42.4674632Z Entering 'third_party/sleef' 2025-09-07T07:50:42.4717646Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:42.4760396Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:42.4802124Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:42.4842094Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:42.4884925Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:42.4925888Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:42.4986248Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-09-07T07:50:42.5012091Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5019274Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-09-07T07:50:42.5051339Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-09-07T07:50:42.5285198Z Entering 'android/libs/fbjni' 2025-09-07T07:50:42.5309458Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5341446Z Entering 
'third_party/FP16' 2025-09-07T07:50:42.5367621Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5400378Z Entering 'third_party/FXdiv' 2025-09-07T07:50:42.5426712Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5458372Z Entering 'third_party/NNPACK' 2025-09-07T07:50:42.5483409Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5517786Z Entering 'third_party/NVTX' 2025-09-07T07:50:42.5542540Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5574292Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:42.5599211Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5630077Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:42.5656703Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5704524Z Entering 'third_party/aiter' 2025-09-07T07:50:42.5730040Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5764162Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:42.5787550Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5831609Z Entering 'third_party/benchmark' 2025-09-07T07:50:42.5856747Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5889037Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:42.5914090Z http.https://github.com/.extraheader 2025-09-07T07:50:42.5956635Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:42.5980885Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6012451Z Entering 'third_party/cpuinfo' 2025-09-07T07:50:42.6037972Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6070047Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:50:42.6095351Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6127644Z Entering 'third_party/cutlass' 2025-09-07T07:50:42.6152979Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6193239Z Entering 'third_party/fbgemm' 2025-09-07T07:50:42.6218509Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6440972Z Entering 'third_party/fbgemm/external/asmjit' 
2025-09-07T07:50:42.6464521Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6496108Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:42.6520217Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6559196Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:42.6583111Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6613901Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:42.6637662Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6677665Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:42.6702299Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6733141Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:42.6756569Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6788952Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:42.6812806Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6847483Z Entering 'third_party/flash-attention' 2025-09-07T07:50:42.6872485Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6905832Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:42.6930464Z http.https://github.com/.extraheader 2025-09-07T07:50:42.6969815Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:42.6993893Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7035691Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:42.7061746Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7096608Z Entering 'third_party/fmt' 2025-09-07T07:50:42.7121605Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7152713Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:42.7178148Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7210324Z Entering 'third_party/gloo' 2025-09-07T07:50:42.7235459Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7266842Z Entering 'third_party/googletest' 2025-09-07T07:50:42.7292839Z 
http.https://github.com/.extraheader 2025-09-07T07:50:42.7326047Z Entering 'third_party/ideep' 2025-09-07T07:50:42.7350998Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7382220Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:42.7406113Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7447537Z Entering 'third_party/ittapi' 2025-09-07T07:50:42.7471680Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7502787Z Entering 'third_party/kineto' 2025-09-07T07:50:42.7527895Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7617768Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:42.7641426Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7672834Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:42.7697185Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7730102Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:42.7754262Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7786019Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:42.7809798Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7844840Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:42.7867757Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7897945Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:42.7921687Z http.https://github.com/.extraheader 2025-09-07T07:50:42.7954680Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:42.7978335Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8011527Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:42.8035785Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8067995Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:42.8093539Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8127078Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:42.8151183Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8188340Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:42.8214097Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8244128Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:42.8268435Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8301565Z Entering 'third_party/kleidiai' 2025-09-07T07:50:42.8327751Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8359624Z Entering 'third_party/mimalloc' 2025-09-07T07:50:42.8383805Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8415950Z Entering 'third_party/nlohmann' 2025-09-07T07:50:42.8440692Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8473732Z Entering 'third_party/onnx' 2025-09-07T07:50:42.8498738Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8548818Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:42.8573349Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8607763Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:42.8632849Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8670332Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:42.8693554Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8724340Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:42.8748204Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8781128Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:42.8805446Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8838131Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:50:42.8861791Z 
http.https://github.com/.extraheader 2025-09-07T07:50:42.8894957Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:42.8918657Z http.https://github.com/.extraheader 2025-09-07T07:50:42.8948872Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:42.8973777Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9005270Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:42.9029034Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9058920Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:42.9082299Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9115322Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:42.9138603Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9174498Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:42.9199001Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9252137Z Entering 'third_party/pocketfft' 2025-09-07T07:50:42.9278586Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9309607Z Entering 'third_party/protobuf' 2025-09-07T07:50:42.9334031Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9373171Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:42.9397136Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9428779Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:42.9453006Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9486726Z Entering 'third_party/psimd' 2025-09-07T07:50:42.9512503Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9544329Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:42.9570170Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9602738Z Entering 'third_party/pybind11' 2025-09-07T07:50:42.9629031Z http.https://github.com/.extraheader 
2025-09-07T07:50:42.9661382Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:42.9687152Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9720155Z Entering 'third_party/sleef' 2025-09-07T07:50:42.9745992Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9778535Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:42.9803907Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9837600Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:42.9860747Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9893773Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:42.9917838Z http.https://github.com/.extraheader 2025-09-07T07:50:42.9949795Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:42.9974256Z http.https://github.com/.extraheader 2025-09-07T07:50:43.0005475Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:43.0030215Z http.https://github.com/.extraheader 2025-09-07T07:50:43.0061305Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:43.0085819Z http.https://github.com/.extraheader 2025-09-07T07:50:43.0138723Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-09-07T07:50:43.0172741Z ##[endgroup] 2025-09-07T07:50:43.0173182Z ##[group]Fetching the repository 2025-09-07T07:50:43.0180372Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-09-07T07:50:43.5096306Z [command]/usr/bin/git rev-parse --verify --quiet 93fb23d6fae7c4e82c4239a1033e522088742634^{object} 2025-09-07T07:50:43.5122181Z 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:50:43.5124728Z ##[endgroup] 2025-09-07T07:50:43.5125156Z ##[group]Determining the checkout info 2025-09-07T07:50:43.5126013Z ##[endgroup] 2025-09-07T07:50:43.5129690Z [command]/usr/bin/git sparse-checkout disable 
2025-09-07T07:50:43.5340622Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-09-07T07:50:43.5367389Z ##[group]Checking out the ref 2025-09-07T07:50:43.5371512Z [command]/usr/bin/git checkout --progress --force 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:50:43.5785734Z HEAD is now at 93fb23d6fae Build vLLM nightly wheels (#162000) 2025-09-07T07:50:43.5793049Z ##[endgroup] 2025-09-07T07:50:43.5793477Z ##[group]Setting up auth for fetching submodules 2025-09-07T07:50:43.5798166Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-09-07T07:50:43.5839513Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-09-07T07:50:43.5864005Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-09-07T07:50:43.5892960Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-09-07T07:50:43.5917554Z ##[endgroup] 2025-09-07T07:50:43.5918015Z ##[group]Fetching submodules 2025-09-07T07:50:43.5920947Z [command]/usr/bin/git submodule sync --recursive 2025-09-07T07:50:43.6168238Z Synchronizing submodule url for 'android/libs/fbjni' 2025-09-07T07:50:43.6190168Z Synchronizing submodule url for 'third_party/FP16' 2025-09-07T07:50:43.6213126Z Synchronizing submodule url for 'third_party/FXdiv' 2025-09-07T07:50:43.6236387Z Synchronizing submodule url for 'third_party/NNPACK' 2025-09-07T07:50:43.6259018Z Synchronizing submodule url for 'third_party/NVTX' 2025-09-07T07:50:43.6282774Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:43.6306659Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-09-07T07:50:43.6346000Z Synchronizing submodule url for 'third_party/aiter' 2025-09-07T07:50:43.6371571Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:43.6403714Z Synchronizing submodule 
url for 'third_party/benchmark' 2025-09-07T07:50:43.6426969Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-09-07T07:50:43.6460980Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-09-07T07:50:43.6485091Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-09-07T07:50:43.6508851Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-09-07T07:50:43.6530579Z Synchronizing submodule url for 'third_party/cutlass' 2025-09-07T07:50:43.6563225Z Synchronizing submodule url for 'third_party/fbgemm' 2025-09-07T07:50:43.6586418Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:43.6607903Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:43.6643682Z Synchronizing submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:43.6663859Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:43.6695810Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:43.6718008Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:43.6739038Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-09-07T07:50:43.6768919Z Synchronizing submodule url for 'third_party/flash-attention' 2025-09-07T07:50:43.6789560Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:43.6817095Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:43.6853326Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-09-07T07:50:43.6879319Z Synchronizing submodule url for 'third_party/fmt' 2025-09-07T07:50:43.6903220Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:43.6926738Z Synchronizing submodule url for 'third_party/gloo' 2025-09-07T07:50:43.6949400Z Synchronizing submodule url for 'third_party/googletest' 
2025-09-07T07:50:43.6972910Z Synchronizing submodule url for 'third_party/ideep' 2025-09-07T07:50:43.6993073Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:43.7022826Z Synchronizing submodule url for 'third_party/ittapi' 2025-09-07T07:50:43.7050253Z Synchronizing submodule url for 'third_party/kineto' 2025-09-07T07:50:43.7075305Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:43.7096424Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:43.7119560Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:43.7141165Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:43.7162990Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:43.7185925Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:43.7210569Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:43.7232142Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:43.7253179Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:43.7276885Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:43.7298714Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:43.7319137Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:43.7342524Z Synchronizing submodule url for 'third_party/kleidiai' 2025-09-07T07:50:43.7366081Z Synchronizing submodule url for 
'third_party/mimalloc' 2025-09-07T07:50:43.7389103Z Synchronizing submodule url for 'third_party/nlohmann' 2025-09-07T07:50:43.7413910Z Synchronizing submodule url for 'third_party/onnx' 2025-09-07T07:50:43.7458611Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:43.7484785Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-09-07T07:50:43.7506981Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:43.7528514Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:43.7549366Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:43.7571272Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:50:43.7593095Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:43.7612662Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:43.7633933Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:43.7654499Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:43.7676677Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:43.7700388Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:43.7744864Z Synchronizing submodule url for 'third_party/pocketfft' 2025-09-07T07:50:43.7768565Z Synchronizing submodule url for 'third_party/protobuf' 2025-09-07T07:50:43.7803581Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:43.7823947Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 
2025-09-07T07:50:43.7849019Z Synchronizing submodule url for 'third_party/psimd' 2025-09-07T07:50:43.7872147Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-09-07T07:50:43.7895239Z Synchronizing submodule url for 'third_party/pybind11' 2025-09-07T07:50:43.7922254Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-09-07T07:50:43.7945000Z Synchronizing submodule url for 'third_party/sleef' 2025-09-07T07:50:43.7968037Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-09-07T07:50:43.7988452Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:43.8009453Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:43.8034132Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:43.8055942Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:43.8075784Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:43.8110279Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-09-07T07:50:43.8864155Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-09-07T07:50:43.9033105Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-09-07T07:50:43.9179523Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-09-07T07:50:43.9445750Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-09-07T07:50:44.0427187Z Submodule path 'third_party/NVTX': checked out '2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07' 2025-09-07T07:50:44.0827699Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-09-07T07:50:44.2068791Z Submodule path 'third_party/XNNPACK': checked out 
'51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-09-07T07:50:44.4546189Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-09-07T07:50:44.7801894Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-09-07T07:50:44.8021467Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-09-07T07:50:45.1735174Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-09-07T07:50:45.3051475Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-09-07T07:50:45.4521343Z Submodule path 'third_party/cpuinfo': checked out '5e3d2445e6a84d9599bee2bf78edbb4d80865e1d' 2025-09-07T07:50:45.5049949Z Submodule path 'third_party/cudnn_frontend': checked out 'f937055efc6d414d11f4c6577e3977fe74f35fb6' 2025-09-07T07:50:46.1729551Z Submodule path 'third_party/cutlass': checked out 'e51efbfe18fe4f4cbb66ab814c55bf4aa0185491' 2025-09-07T07:50:46.3209934Z Submodule path 'third_party/fbgemm': checked out '4b39c551efe15e6bbade20565b0ceb2d8ce3352d' 2025-09-07T07:50:46.3901499Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-09-07T07:50:46.6896528Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out 'b1281b8b08d973a7064f864f47eeb30f3e2596e9' 2025-09-07T07:50:46.8374308Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-09-07T07:50:47.1561865Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '311f3c8e51dc0eb56310cfc6980bf63d0fbd7917' 2025-09-07T07:50:47.1966260Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-09-07T07:50:47.2101506Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out 
'63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-09-07T07:50:47.3343947Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-09-07T07:50:47.4209907Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-09-07T07:50:47.7041909Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-09-07T07:50:48.3287974Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-09-07T07:50:48.6392182Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-09-07T07:50:48.6875388Z Submodule path 'third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-09-07T07:50:48.7330995Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-09-07T07:50:48.7561124Z Submodule path 'third_party/gloo': checked out 'c7b7b022c124d9643957d9bd55f57ac59fce8fa2' 2025-09-07T07:50:48.7952977Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-09-07T07:50:48.8092741Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-09-07T07:50:49.2446342Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-09-07T07:50:49.2673757Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-09-07T07:50:49.3718624Z Submodule path 'third_party/kineto': checked out '5e7501833f1021ce6f618572d3baf657b6319658' 2025-09-07T07:50:49.5471622Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e' 2025-09-07T07:50:49.7609989Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 
'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-09-07T07:50:49.7772418Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-09-07T07:50:49.8111500Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-09-07T07:50:49.8253757Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-09-07T07:50:49.8348163Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-09-07T07:50:49.8511017Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-09-07T07:50:49.8896893Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850' 2025-09-07T07:50:49.9889986Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-09-07T07:50:50.0056791Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-09-07T07:50:50.0360913Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164' 2025-09-07T07:50:50.0749412Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2025-09-07T07:50:50.1243439Z Submodule path 'third_party/kleidiai': checked out 'cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7' 2025-09-07T07:50:50.1864577Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-09-07T07:50:50.3178836Z Submodule path 
'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-09-07T07:50:50.4579380Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-09-07T07:50:50.5073641Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-09-07T07:50:50.5775195Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-09-07T07:50:50.5956589Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-09-07T07:50:50.6341945Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-09-07T07:50:50.6495374Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-09-07T07:50:50.7828607Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-09-07T07:50:50.7994280Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-09-07T07:50:50.8142497Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-09-07T07:50:50.8286544Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-09-07T07:50:51.2857204Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-09-07T07:50:51.3437439Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-09-07T07:50:51.5737935Z Submodule path 
'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-09-07T07:50:51.5874413Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-09-07T07:50:51.9365623Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-09-07T07:50:51.9534681Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-09-07T07:50:51.9982384Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-09-07T07:50:52.0085246Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-09-07T07:50:52.0245408Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-09-07T07:50:52.0633499Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-09-07T07:50:52.1146071Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-09-07T07:50:52.1706143Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-09-07T07:50:52.1940127Z Submodule path 'third_party/tensorpipe': checked out 'af0118d13e52f5a08841464a768e01a0bf3e3075' 2025-09-07T07:50:52.2320538Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-09-07T07:50:52.2477059Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-09-07T07:50:52.2811331Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-09-07T07:50:52.3063757Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-09-07T07:50:52.3154452Z 
Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-09-07T07:50:52.3194095Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-09-07T07:50:52.3456164Z Entering 'android/libs/fbjni' 2025-09-07T07:50:52.3494527Z Entering 'third_party/FP16' 2025-09-07T07:50:52.3535157Z Entering 'third_party/FXdiv' 2025-09-07T07:50:52.3573801Z Entering 'third_party/NNPACK' 2025-09-07T07:50:52.3612003Z Entering 'third_party/NVTX' 2025-09-07T07:50:52.3652736Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:52.3693143Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:52.3748867Z Entering 'third_party/aiter' 2025-09-07T07:50:52.3788261Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:52.3836648Z Entering 'third_party/benchmark' 2025-09-07T07:50:52.3876368Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:52.3924377Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:52.3964653Z Entering 'third_party/cpuinfo' 2025-09-07T07:50:52.4004464Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:50:52.4042657Z Entering 'third_party/cutlass' 2025-09-07T07:50:52.4091920Z Entering 'third_party/fbgemm' 2025-09-07T07:50:52.4135796Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:52.4172230Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:52.4217726Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:52.4255080Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:52.4300318Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:52.4336965Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:52.4372690Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:52.4412825Z Entering 'third_party/flash-attention' 2025-09-07T07:50:52.4453731Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:52.4496287Z Entering 
'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:52.4543467Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:52.4585256Z Entering 'third_party/fmt' 2025-09-07T07:50:52.4625570Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:52.4663350Z Entering 'third_party/gloo' 2025-09-07T07:50:52.4704872Z Entering 'third_party/googletest' 2025-09-07T07:50:52.4744984Z Entering 'third_party/ideep' 2025-09-07T07:50:52.4786022Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:52.4831732Z Entering 'third_party/ittapi' 2025-09-07T07:50:52.4870413Z Entering 'third_party/kineto' 2025-09-07T07:50:52.4945893Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:52.4991272Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:52.5028590Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:52.5088542Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:52.5142382Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:52.5186458Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:52.5235395Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:52.5290326Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:52.5335980Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:52.5393041Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:52.5432928Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:52.5476326Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:52.5554267Z Entering 'third_party/kleidiai' 2025-09-07T07:50:52.5677076Z Entering 'third_party/mimalloc' 2025-09-07T07:50:52.5715752Z Entering 
'third_party/nlohmann' 2025-09-07T07:50:52.5756793Z Entering 'third_party/onnx' 2025-09-07T07:50:52.5813149Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:52.5853947Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:52.5923663Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:52.5959864Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:52.5998224Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:52.6035943Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:50:52.6078788Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:52.6122311Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:52.6164486Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:52.6239191Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:52.6300432Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:52.6358634Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:52.6419368Z Entering 'third_party/pocketfft' 2025-09-07T07:50:52.6466929Z Entering 'third_party/protobuf' 2025-09-07T07:50:52.6507477Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:52.6560785Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:52.6607306Z Entering 'third_party/psimd' 2025-09-07T07:50:52.6668045Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:52.6706434Z Entering 'third_party/pybind11' 2025-09-07T07:50:52.6754698Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:52.6825596Z Entering 'third_party/sleef' 2025-09-07T07:50:52.6865959Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:52.6908260Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:52.6951934Z Entering 
'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:52.6992040Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:52.7029100Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:52.7066403Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:52.7128886Z ##[endgroup] 2025-09-07T07:50:52.7129360Z ##[group]Persisting credentials for submodules 2025-09-07T07:50:52.7135579Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-09-07T07:50:52.7386655Z Entering 'android/libs/fbjni' 2025-09-07T07:50:52.7412984Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7413348Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7451439Z Entering 'third_party/FP16' 2025-09-07T07:50:52.7477463Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7477843Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7510121Z Entering 'third_party/FXdiv' 2025-09-07T07:50:52.7536502Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7536915Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7593347Z Entering 'third_party/NNPACK' 2025-09-07T07:50:52.7619574Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7620195Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7652318Z Entering 'third_party/NVTX' 2025-09-07T07:50:52.7676777Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7677200Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7713727Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:52.7738828Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7739165Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7779399Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:52.7804360Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7804958Z url.https://github.com/.insteadof 
2025-09-07T07:50:52.7860442Z Entering 'third_party/aiter' 2025-09-07T07:50:52.7881418Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7882016Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7915646Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:52.7941505Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7941924Z url.https://github.com/.insteadof 2025-09-07T07:50:52.7984892Z Entering 'third_party/benchmark' 2025-09-07T07:50:52.8012801Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8013165Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8051637Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:52.8078561Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8078938Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8124399Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:52.8149339Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8149704Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8181243Z Entering 'third_party/cpuinfo' 2025-09-07T07:50:52.8208387Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8208789Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8243890Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:50:52.8271388Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8271749Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8306465Z Entering 'third_party/cutlass' 2025-09-07T07:50:52.8332905Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8333277Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8377135Z Entering 'third_party/fbgemm' 2025-09-07T07:50:52.8402595Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8403118Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8438302Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:52.8462779Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8463139Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8498078Z Entering 
'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:52.8522056Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8522416Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8561516Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:52.8586818Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8587186Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8619844Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:52.8644099Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8644521Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8686247Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:52.8710922Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8711261Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8742402Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:52.8769220Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8769835Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8802421Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:52.8830297Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8830674Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8867718Z Entering 'third_party/flash-attention' 2025-09-07T07:50:52.8894347Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8894709Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8931689Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:52.8955918Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8956514Z url.https://github.com/.insteadof 2025-09-07T07:50:52.8995152Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:52.9021235Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9021592Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9073106Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:52.9099690Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9100033Z url.https://github.com/.insteadof 
2025-09-07T07:50:52.9134802Z Entering 'third_party/fmt' 2025-09-07T07:50:52.9160560Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9160966Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9191885Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:52.9217943Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9218291Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9250238Z Entering 'third_party/gloo' 2025-09-07T07:50:52.9276065Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9276654Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9310623Z Entering 'third_party/googletest' 2025-09-07T07:50:52.9337372Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9337846Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9369865Z Entering 'third_party/ideep' 2025-09-07T07:50:52.9398215Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9398558Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9433342Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:52.9457508Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9457861Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9498878Z Entering 'third_party/ittapi' 2025-09-07T07:50:52.9524794Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9525176Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9556439Z Entering 'third_party/kineto' 2025-09-07T07:50:52.9582078Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9582456Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9614528Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:52.9639458Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9639860Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9678544Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:52.9704771Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9705155Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9739820Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:52.9767060Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9767457Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9798940Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:52.9823033Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9823413Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9860629Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:52.9886632Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9887033Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9917843Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:52.9943035Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9943395Z url.https://github.com/.insteadof 2025-09-07T07:50:52.9979537Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:53.0004103Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0004491Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0038775Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:53.0063638Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0064292Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0096613Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:53.0122659Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0123196Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0158093Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:53.0184401Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0184755Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0219070Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:53.0243791Z url.https://github.com/.insteadof 
2025-09-07T07:50:53.0244171Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0279097Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:53.0303567Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0303949Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0339649Z Entering 'third_party/kleidiai' 2025-09-07T07:50:53.0365815Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0366207Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0400764Z Entering 'third_party/mimalloc' 2025-09-07T07:50:53.0426585Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0426940Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0458904Z Entering 'third_party/nlohmann' 2025-09-07T07:50:53.0484728Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0485088Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0517069Z Entering 'third_party/onnx' 2025-09-07T07:50:53.0543235Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0543579Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0595150Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:53.0619178Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0619555Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0654615Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:53.0680162Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0680539Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0715542Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:53.0739503Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0739876Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0774258Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:53.0798296Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0798675Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0830466Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:53.0855391Z 
url.https://github.com/.insteadof 2025-09-07T07:50:53.0855760Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0891471Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:50:53.0915515Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0915959Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0954848Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:53.0980350Z url.https://github.com/.insteadof 2025-09-07T07:50:53.0980762Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1015207Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:53.1039583Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1039936Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1072700Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:53.1096538Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1096903Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1132109Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:53.1156680Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1157051Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1192694Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:53.1217641Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1218017Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1255631Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:53.1279750Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1280097Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1335215Z Entering 'third_party/pocketfft' 2025-09-07T07:50:53.1361003Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1361668Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1400467Z Entering 'third_party/protobuf' 2025-09-07T07:50:53.1427358Z url.https://github.com/.insteadof 
2025-09-07T07:50:53.1427699Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1463122Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:53.1487573Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1487938Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1522003Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:53.1546367Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1546743Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1585729Z Entering 'third_party/psimd' 2025-09-07T07:50:53.1611582Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1611992Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1642746Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:53.1668480Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1668867Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1705027Z Entering 'third_party/pybind11' 2025-09-07T07:50:53.1732924Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1733324Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1768983Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:53.1794919Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1795275Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1826610Z Entering 'third_party/sleef' 2025-09-07T07:50:53.1852759Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1853208Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1884379Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:53.1909410Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1909788Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1942007Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:53.1965895Z url.https://github.com/.insteadof 2025-09-07T07:50:53.1966254Z url.https://github.com/.insteadof 2025-09-07T07:50:53.2005201Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:53.2029119Z url.https://github.com/.insteadof 2025-09-07T07:50:53.2029475Z 
url.https://github.com/.insteadof 2025-09-07T07:50:53.2061788Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:53.2087253Z url.https://github.com/.insteadof 2025-09-07T07:50:53.2087645Z url.https://github.com/.insteadof 2025-09-07T07:50:53.2118996Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:53.2143772Z url.https://github.com/.insteadof 2025-09-07T07:50:53.2144131Z url.https://github.com/.insteadof 2025-09-07T07:50:53.2175872Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:53.2202806Z url.https://github.com/.insteadof 2025-09-07T07:50:53.2203324Z url.https://github.com/.insteadof 2025-09-07T07:50:53.2257060Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-09-07T07:50:53.2502440Z Entering 'android/libs/fbjni' 2025-09-07T07:50:53.2541858Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-09-07T07:50:53.2561414Z Entering 'third_party/FP16' 2025-09-07T07:50:53.2601782Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-09-07T07:50:53.2620715Z Entering 'third_party/FXdiv' 2025-09-07T07:50:53.2664796Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-09-07T07:50:53.2683557Z Entering 'third_party/NNPACK' 2025-09-07T07:50:53.2727226Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-09-07T07:50:53.2746693Z Entering 'third_party/NVTX' 2025-09-07T07:50:53.2790868Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-09-07T07:50:53.2812238Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:53.2854989Z 
file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-09-07T07:50:53.2873837Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:53.2912585Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-09-07T07:50:53.2953692Z Entering 'third_party/aiter' 2025-09-07T07:50:53.2993100Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-09-07T07:50:53.3037132Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:53.3077373Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-09-07T07:50:53.3144934Z Entering 'third_party/benchmark' 2025-09-07T07:50:53.3188982Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:50:53.3211177Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:53.3248788Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-09-07T07:50:53.3277106Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:53.3315774Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-09-07T07:50:53.3334425Z Entering 'third_party/cpuinfo' 2025-09-07T07:50:53.3372466Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-09-07T07:50:53.3391774Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:50:53.3431128Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-09-07T07:50:53.3450851Z Entering 'third_party/cutlass' 2025-09-07T07:50:53.3489214Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-09-07T07:50:53.3518417Z Entering 'third_party/fbgemm' 2025-09-07T07:50:53.3557030Z 
file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-09-07T07:50:53.3578501Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:53.3616126Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-09-07T07:50:53.3634370Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:53.3671320Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-09-07T07:50:53.3702222Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:53.3743630Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-09-07T07:50:53.3761698Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:53.3799402Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-09-07T07:50:53.3833891Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:53.3875460Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-09-07T07:50:53.3893643Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:53.3933730Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-09-07T07:50:53.3950250Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:53.3988878Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-09-07T07:50:53.4012081Z Entering 'third_party/flash-attention' 2025-09-07T07:50:53.4052346Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-09-07T07:50:53.4071954Z Entering 
'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:53.4119665Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-09-07T07:50:53.4138206Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:53.4179280Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-09-07T07:50:53.4209108Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:53.4250834Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-09-07T07:50:53.4273213Z Entering 'third_party/fmt' 2025-09-07T07:50:53.4313512Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-09-07T07:50:53.4336082Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:53.4376451Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-09-07T07:50:53.4395341Z Entering 'third_party/gloo' 2025-09-07T07:50:53.4436322Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-09-07T07:50:53.4457242Z Entering 'third_party/googletest' 2025-09-07T07:50:53.4503410Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:50:53.4523786Z Entering 'third_party/ideep' 2025-09-07T07:50:53.4566932Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-09-07T07:50:53.4584832Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:53.4624518Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-09-07T07:50:53.4652658Z Entering 'third_party/ittapi' 2025-09-07T07:50:53.4691636Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 
2025-09-07T07:50:53.4710210Z Entering 'third_party/kineto' 2025-09-07T07:50:53.4754609Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-09-07T07:50:53.4773651Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:53.4821325Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-09-07T07:50:53.4838854Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:53.4878836Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-09-07T07:50:53.4898612Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:53.4942393Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-09-07T07:50:53.4959726Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:53.5002488Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-09-07T07:50:53.5021089Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:53.5066038Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-09-07T07:50:53.5085839Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:53.5125894Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-09-07T07:50:53.5145824Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:53.5188837Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-09-07T07:50:53.5206775Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:53.5255968Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:50:53.5273614Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:53.5324815Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-09-07T07:50:53.5343543Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:53.5446341Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-09-07T07:50:53.5466881Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:53.5566493Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-09-07T07:50:53.5585329Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:53.5653538Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-09-07T07:50:53.5674397Z Entering 'third_party/kleidiai' 2025-09-07T07:50:53.5775880Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-09-07T07:50:53.5795208Z Entering 'third_party/mimalloc' 2025-09-07T07:50:53.5903005Z 
file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-09-07T07:50:53.5923920Z Entering 'third_party/nlohmann' 2025-09-07T07:50:53.6021227Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-09-07T07:50:53.6042635Z Entering 'third_party/onnx' 2025-09-07T07:50:53.6159917Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-09-07T07:50:53.6198513Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:53.6277300Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:50:53.6299078Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:53.6456219Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-09-07T07:50:53.6475608Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:53.6544623Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:50:53.6563443Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:53.6604344Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:50:53.6624622Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:53.6721542Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-09-07T07:50:53.6739661Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:50:53.6830782Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-09-07T07:50:53.6849476Z Entering 
'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:53.6950185Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-09-07T07:50:53.6968906Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:53.7049912Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-09-07T07:50:53.7068173Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:53.7165571Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-09-07T07:50:53.7182826Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:53.7277303Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-09-07T07:50:53.7299506Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:53.7414244Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-09-07T07:50:53.7436512Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:53.7518604Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-09-07T07:50:53.7571303Z Entering 'third_party/pocketfft' 2025-09-07T07:50:53.7616258Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-09-07T07:50:53.7633960Z Entering 'third_party/protobuf' 2025-09-07T07:50:53.7689539Z 
file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-09-07T07:50:53.7710723Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:53.7793889Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:50:53.7816481Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:53.7912602Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:50:53.7937338Z Entering 'third_party/psimd' 2025-09-07T07:50:53.8035368Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-09-07T07:50:53.8053911Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:53.8144856Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-09-07T07:50:53.8169482Z Entering 'third_party/pybind11' 2025-09-07T07:50:53.8285988Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:50:53.8305840Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:53.8382290Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-09-07T07:50:53.8401518Z Entering 'third_party/sleef' 2025-09-07T07:50:53.8466085Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-09-07T07:50:53.8485466Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:53.8584216Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-09-07T07:50:53.8603749Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:53.8684953Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config 
remote.origin.url 2025-09-07T07:50:53.8703337Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:53.8779981Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-09-07T07:50:53.8798136Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:54.0600059Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-09-07T07:50:54.0703333Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:54.0729272Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:50:54.0745772Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:54.0783158Z file:/home/charlie/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-09-07T07:50:54.5745456Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-09-07T07:50:54.6028712Z Entering 'android/libs/fbjni' 2025-09-07T07:50:54.6151408Z Entering 'third_party/FP16' 2025-09-07T07:50:54.6262876Z Entering 'third_party/FXdiv' 2025-09-07T07:50:54.6366828Z Entering 'third_party/NNPACK' 2025-09-07T07:50:54.6510477Z Entering 'third_party/NVTX' 2025-09-07T07:50:54.6613555Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:54.6690275Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:54.6831546Z Entering 'third_party/aiter' 2025-09-07T07:50:54.6957849Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:50:54.7048942Z Entering 'third_party/benchmark' 2025-09-07T07:50:54.7099783Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:54.7223835Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:54.7307215Z Entering 'third_party/cpuinfo' 
2025-09-07T07:50:54.7408374Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:50:54.7494900Z Entering 'third_party/cutlass' 2025-09-07T07:50:54.7634202Z Entering 'third_party/fbgemm' 2025-09-07T07:50:54.7728528Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:54.7818905Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:54.7919639Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:54.7981953Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:54.8044363Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:54.8139388Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:54.8230401Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:54.8302635Z Entering 'third_party/flash-attention' 2025-09-07T07:50:54.8410651Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:54.8525898Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:54.8630052Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:54.8720510Z Entering 'third_party/fmt' 2025-09-07T07:50:54.8831645Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:54.8919524Z Entering 'third_party/gloo' 2025-09-07T07:50:54.8967534Z Entering 'third_party/googletest' 2025-09-07T07:50:54.9069816Z Entering 'third_party/ideep' 2025-09-07T07:50:54.9166575Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:55.1486113Z Entering 'third_party/ittapi' 2025-09-07T07:50:55.1618327Z Entering 'third_party/kineto' 2025-09-07T07:50:55.1738916Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:55.1825928Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:50:55.2040998Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:55.2599530Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:55.2700964Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:55.2806712Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:55.2929276Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:55.3050251Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:55.3339547Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:55.3872986Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:55.3951990Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:55.4228075Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:55.4273003Z Entering 'third_party/kleidiai' 2025-09-07T07:50:55.4320419Z Entering 'third_party/mimalloc' 2025-09-07T07:50:55.4599476Z Entering 'third_party/nlohmann' 2025-09-07T07:50:55.4956620Z Entering 'third_party/onnx' 2025-09-07T07:50:55.5017142Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:55.5469582Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:55.5524655Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:55.5698252Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:55.6764964Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:55.6998114Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:50:55.7072736Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:55.7180708Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:50:55.7767652Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:55.8250049Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:55.8363198Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:55.8531676Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:55.8867273Z Entering 'third_party/pocketfft' 2025-09-07T07:50:55.9448645Z Entering 'third_party/protobuf' 2025-09-07T07:50:55.9539940Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:55.9807907Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:56.0443926Z Entering 'third_party/psimd' 2025-09-07T07:50:56.0992308Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:56.1599528Z Entering 'third_party/pybind11' 2025-09-07T07:50:56.1771112Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:56.1931623Z Entering 'third_party/sleef' 2025-09-07T07:50:56.2969061Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:56.4159447Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:56.4327608Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:56.4613703Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:56.5215479Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:56.5380224Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:56.5617868Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-09-07T07:50:56.5892856Z Entering 'android/libs/fbjni' 2025-09-07T07:50:56.5983369Z Entering 'third_party/FP16' 2025-09-07T07:50:56.6147003Z Entering 'third_party/FXdiv' 2025-09-07T07:50:56.6431984Z Entering 'third_party/NNPACK' 2025-09-07T07:50:56.6982218Z Entering 'third_party/NVTX' 2025-09-07T07:50:56.7125784Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:50:56.7257995Z Entering 'third_party/XNNPACK' 2025-09-07T07:50:56.7460364Z Entering 'third_party/aiter' 2025-09-07T07:50:56.7944826Z Entering 'third_party/aiter/3rdparty/composable_kernel' 
2025-09-07T07:50:56.8118599Z Entering 'third_party/benchmark' 2025-09-07T07:50:56.8290528Z Entering 'third_party/composable_kernel' 2025-09-07T07:50:56.8872397Z Entering 'third_party/cpp-httplib' 2025-09-07T07:50:56.9354392Z Entering 'third_party/cpuinfo' 2025-09-07T07:50:57.0931949Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:50:57.2554593Z Entering 'third_party/cutlass' 2025-09-07T07:50:57.2769099Z Entering 'third_party/fbgemm' 2025-09-07T07:50:57.3879047Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:50:57.5129877Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:50:57.5385711Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:50:57.5526091Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:50:57.7167631Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:50:57.7193604Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:50:57.7909612Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:50:57.8633769Z Entering 'third_party/flash-attention' 2025-09-07T07:50:58.0045193Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:50:58.0264224Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:50:58.0844482Z Entering 'third_party/flatbuffers' 2025-09-07T07:50:58.1640758Z Entering 'third_party/fmt' 2025-09-07T07:50:58.1858847Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:50:58.1976706Z Entering 'third_party/gloo' 2025-09-07T07:50:58.3702658Z Entering 'third_party/googletest' 2025-09-07T07:50:58.3804799Z Entering 'third_party/ideep' 2025-09-07T07:50:58.3897049Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:50:58.4043479Z Entering 'third_party/ittapi' 2025-09-07T07:50:58.5051341Z Entering 'third_party/kineto' 2025-09-07T07:50:58.5727631Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:50:58.5827700Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 
2025-09-07T07:50:58.5896035Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:50:58.5974417Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:50:58.6103485Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:50:58.6185673Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:50:58.6249406Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:50:58.6315723Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:50:58.6476824Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:50:58.6531949Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:50:58.6651466Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:50:58.6734385Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:50:58.6880927Z Entering 'third_party/kleidiai' 2025-09-07T07:50:58.6953668Z Entering 'third_party/mimalloc' 2025-09-07T07:50:58.7050344Z Entering 'third_party/nlohmann' 2025-09-07T07:50:58.7101787Z Entering 'third_party/onnx' 2025-09-07T07:50:58.7259970Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:50:58.7403777Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:50:58.7493046Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:50:58.7640508Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:50:58.7697942Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:50:58.7839125Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:50:58.7913129Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:50:58.8065647Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 
2025-09-07T07:50:58.8119493Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:50:58.8312093Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:50:58.8418216Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:50:58.8489462Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:50:58.8656128Z Entering 'third_party/pocketfft' 2025-09-07T07:50:58.8723232Z Entering 'third_party/protobuf' 2025-09-07T07:50:58.8876821Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:50:58.8936917Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:50:58.9126857Z Entering 'third_party/psimd' 2025-09-07T07:50:58.9252546Z Entering 'third_party/pthreadpool' 2025-09-07T07:50:58.9340655Z Entering 'third_party/pybind11' 2025-09-07T07:50:58.9441149Z Entering 'third_party/python-peachpy' 2025-09-07T07:50:58.9525084Z Entering 'third_party/sleef' 2025-09-07T07:50:58.9728880Z Entering 'third_party/tensorpipe' 2025-09-07T07:50:58.9847582Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:50:58.9923967Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:50:59.0010454Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:50:59.0099019Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:50:59.0142723Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:50:59.0304540Z ##[endgroup] 2025-09-07T07:50:59.0351009Z [command]/usr/bin/git log -1 --format=%H 2025-09-07T07:50:59.0376138Z 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:50:59.0602090Z Prepare all required actions 2025-09-07T07:50:59.0602973Z Getting action download info 2025-09-07T07:50:59.4811779Z ##[group]Run ./.github/actions/setup-linux 2025-09-07T07:50:59.4812141Z env: 2025-09-07T07:50:59.4812415Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:50:59.4812671Z 
##[endgroup] 2025-09-07T07:50:59.5216163Z ##[group]Run set -euo pipefail 2025-09-07T07:50:59.5216819Z set -euo pipefail 2025-09-07T07:50:59.5217333Z function get_ec2_metadata() { 2025-09-07T07:50:59.5217957Z  # Pulled from instance metadata endpoint for EC2 2025-09-07T07:50:59.5219133Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2025-09-07T07:50:59.5219938Z  category=$1 2025-09-07T07:50:59.5220598Z  # If it is GCP runner (runner name contains gcp), do not run this 2025-09-07T07:50:59.5221392Z  runner_name_str=i-03028b1668c838483-1003 2025-09-07T07:50:59.5221979Z  if [[ -f /.inarc ]]; then 2025-09-07T07:50:59.5222903Z  echo "ARC Runner, no info on ec2 metadata" 2025-09-07T07:50:59.5223417Z  elif [[ $runner_name_str == *"gcp"* ]]; then 2025-09-07T07:50:59.5224235Z  echo "Runner is from Google Cloud Platform, No info on ec2 metadata" 2025-09-07T07:50:59.5225039Z  else 2025-09-07T07:50:59.5226583Z  curl -H "X-aws-ec2-metadata-token: $(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 30")" -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2025-09-07T07:50:59.5228146Z  fi 2025-09-07T07:50:59.5228519Z } 2025-09-07T07:50:59.5228989Z echo "ami-id: $(get_ec2_metadata ami-id)" 2025-09-07T07:50:59.5229574Z echo "instance-id: $(get_ec2_metadata instance-id)" 2025-09-07T07:50:59.5230213Z echo "instance-type: $(get_ec2_metadata instance-type)" 2025-09-07T07:50:59.5230638Z echo "system info $(uname -a)" 2025-09-07T07:50:59.5246752Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:50:59.5247258Z env: 2025-09-07T07:50:59.5247640Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:50:59.5248079Z ##[endgroup] 2025-09-07T07:50:59.5353700Z ami-id: ARC Runner, no info on ec2 metadata 2025-09-07T07:50:59.5359552Z instance-id: ARC Runner, no info on ec2 metadata 2025-09-07T07:50:59.5365629Z instance-type: ARC Runner, no info on ec2 metadata 2025-09-07T07:50:59.5376419Z 
system info Linux 15a98ee0aa9d 6.5.0-1024-aws #24~22.04.1-Ubuntu SMP Thu Jul 18 10:43:12 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux 2025-09-07T07:50:59.5498137Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-09-07T07:50:59.5499258Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-09-07T07:50:59.5512909Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:50:59.5513322Z env: 2025-09-07T07:50:59.5513533Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:50:59.5513883Z ##[endgroup] 2025-09-07T07:50:59.5828717Z ##[group]Run nick-fields/retry@v3.0.0 2025-09-07T07:50:59.5829275Z with: 2025-09-07T07:50:59.5829639Z shell: bash 2025-09-07T07:50:59.5830033Z timeout_minutes: 5 2025-09-07T07:50:59.5830458Z max_attempts: 3 2025-09-07T07:50:59.5830880Z retry_wait_seconds: 30 2025-09-07T07:50:59.5835632Z command: AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" # For LF Runners we need to make sure we also login to Meta's ECR docker registry too. 
META_AWS_ACCOUNT_ID=308535385114 if [ "$AWS_ACCOUNT_ID" != "$META_AWS_ACCOUNT_ID" ] ; then aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$META_AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" fi 2025-09-07T07:50:59.5840264Z polling_interval_seconds: 1 2025-09-07T07:50:59.5840785Z warning_on_retry: true 2025-09-07T07:50:59.5841263Z continue_on_error: false 2025-09-07T07:50:59.5841728Z env: 2025-09-07T07:50:59.5842088Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:50:59.5842569Z AWS_RETRY_MODE: standard 2025-09-07T07:50:59.5843203Z AWS_MAX_ATTEMPTS: 5 2025-09-07T07:50:59.5843670Z AWS_DEFAULT_REGION: us-east-1 2025-09-07T07:50:59.5844155Z ##[endgroup] 2025-09-07T07:51:01.5377761Z 2025-09-07T07:51:01.5378431Z WARNING! Your credentials are stored unencrypted in '/home/charlie/.docker/config.json'. 2025-09-07T07:51:01.5379078Z Configure a credential helper to remove this warning. See 2025-09-07T07:51:01.5379536Z https://docs.docker.com/go/credential-store/ 2025-09-07T07:51:01.5379799Z 2025-09-07T07:51:01.5379909Z Login Succeeded 2025-09-07T07:51:01.6678044Z Command completed after 1 attempt(s). 
2025-09-07T07:51:01.6825822Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T07:51:01.6826600Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T07:51:01.6827097Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T07:51:01.6840109Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:51:01.6840496Z env: 2025-09-07T07:51:01.6840723Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:51:01.6840993Z ##[endgroup] 2025-09-07T07:51:01.7101423Z ##[group]Run set +e 2025-09-07T07:51:01.7101765Z set +e 2025-09-07T07:51:01.7102158Z set -x 2025-09-07T07:51:01.7102546Z  2025-09-07T07:51:01.7102807Z PT_DOMAIN=download.pytorch.org 2025-09-07T07:51:01.7103403Z # TODO: Flaky access to download.pytorch.org https://github.com/pytorch/pytorch/issues/100400, 2025-09-07T07:51:01.7104190Z # cleaning this up once the issue is fixed. There are more than one resolved IP here, the last 2025-09-07T07:51:01.7104757Z # one is returned at random 2025-09-07T07:51:01.7105177Z RESOLVED_IP=$(dig -4 +short "${PT_DOMAIN}" | tail -n1) 2025-09-07T07:51:01.7105553Z  2025-09-07T07:51:01.7105791Z if [ -z "${RESOLVED_IP}" ]; then 2025-09-07T07:51:01.7106231Z  echo "Couldn't resolve ${PT_DOMAIN}, retrying with Google DNS..." 2025-09-07T07:51:01.7106772Z  RESOLVED_IP=$(dig -4 +short "${PT_DOMAIN}" @8.8.8.8 | tail -n1) 2025-09-07T07:51:01.7107177Z  2025-09-07T07:51:01.7107403Z  if [ -z "${RESOLVED_IP}" ]; then 2025-09-07T07:51:01.7107795Z  echo "Couldn't resolve ${PT_DOMAIN}, exiting..." 
2025-09-07T07:51:01.7108175Z  exit 1 2025-09-07T07:51:01.7108416Z  fi 2025-09-07T07:51:01.7108623Z fi 2025-09-07T07:51:01.7108830Z  2025-09-07T07:51:01.7109088Z if grep -r "${PT_DOMAIN}" /etc/hosts; then 2025-09-07T07:51:01.7109463Z  # Clean up any old records first 2025-09-07T07:51:01.7109832Z  sudo sed -i "/${PT_DOMAIN}/d" /etc/hosts 2025-09-07T07:51:01.7110162Z fi 2025-09-07T07:51:01.7110370Z  2025-09-07T07:51:01.7110678Z echo "${RESOLVED_IP} ${PT_DOMAIN}" | sudo tee -a /etc/hosts 2025-09-07T07:51:01.7111074Z cat /etc/hosts 2025-09-07T07:51:01.7123492Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:51:01.7124170Z env: 2025-09-07T07:51:01.7124545Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:51:01.7124866Z ##[endgroup] 2025-09-07T07:51:01.7216883Z + PT_DOMAIN=download.pytorch.org 2025-09-07T07:51:01.7222821Z ++ dig -4 +short download.pytorch.org 2025-09-07T07:51:01.7223468Z ++ tail -n1 2025-09-07T07:51:01.7335014Z + RESOLVED_IP=18.160.10.22 2025-09-07T07:51:01.7335485Z + '[' -z 18.160.10.22 ']' 2025-09-07T07:51:01.7335933Z + grep -r download.pytorch.org /etc/hosts 2025-09-07T07:51:01.7348957Z + echo '18.160.10.22 download.pytorch.org' 2025-09-07T07:51:01.7350388Z + sudo tee -a /etc/hosts 2025-09-07T07:51:01.7414375Z 18.160.10.22 download.pytorch.org 2025-09-07T07:51:01.7421439Z + cat /etc/hosts 2025-09-07T07:51:01.7431138Z 127.0.0.1 localhost 2025-09-07T07:51:01.7435851Z ::1 localhost ip6-localhost ip6-loopback 2025-09-07T07:51:01.7436213Z fe00:: ip6-localnet 2025-09-07T07:51:01.7436476Z ff00:: ip6-mcastprefix 2025-09-07T07:51:01.7436731Z ff02::1 ip6-allnodes 2025-09-07T07:51:01.7436966Z ff02::2 ip6-allrouters 2025-09-07T07:51:01.7437216Z 172.17.0.2 15a98ee0aa9d 2025-09-07T07:51:01.7437482Z 18.160.10.22 download.pytorch.org 2025-09-07T07:51:01.7470463Z ##[group]Run set +x 2025-09-07T07:51:01.7470816Z set +x 2025-09-07T07:51:01.7471053Z  2025-09-07T07:51:01.7471371Z max_attempts=30 2025-09-07T07:51:01.7471624Z delay=10 
2025-09-07T07:51:01.7471915Z attempt=1 2025-09-07T07:51:01.7472157Z  2025-09-07T07:51:01.7472422Z for attempt in $(seq 1 $max_attempts); do 2025-09-07T07:51:01.7472966Z  echo "Attempt $attempt of $max_attempts: Checking if Docker daemon is running..." 2025-09-07T07:51:01.7473819Z  if docker info > /dev/null 2>&1; then 2025-09-07T07:51:01.7474278Z  echo "Docker is running. Proceeding with the next steps" 2025-09-07T07:51:01.7474680Z  exit 0 2025-09-07T07:51:01.7474907Z  else 2025-09-07T07:51:01.7475176Z  echo "Docker is not running yet." 2025-09-07T07:51:01.7475542Z  echo "Retrying in $delay seconds..." 2025-09-07T07:51:01.7475888Z  sleep $delay 2025-09-07T07:51:01.7476309Z  fi 2025-09-07T07:51:01.7476690Z done 2025-09-07T07:51:01.7477283Z echo "Reached maximum attempts to connect to Docker. Exiting." 2025-09-07T07:51:01.7478066Z exit 1 2025-09-07T07:51:01.7494118Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:51:01.7494515Z env: 2025-09-07T07:51:01.7494737Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:51:01.7495005Z ##[endgroup] 2025-09-07T07:51:01.7620363Z Attempt 1 of 30: Checking if Docker daemon is running... 2025-09-07T07:51:01.8047188Z Docker is running. Proceeding with the next steps 2025-09-07T07:51:01.8306381Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-09-07T07:51:01.8306876Z with: 2025-09-07T07:51:01.8307824Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:01.8308872Z use-custom-docker-registry: true 2025-09-07T07:51:01.8309182Z docker-build-dir: .ci/docker 2025-09-07T07:51:01.8309480Z docker-build-script: ./build.sh 2025-09-07T07:51:01.8309782Z working-directory: . 
2025-09-07T07:51:01.8310257Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:51:01.8310980Z force-push: false 2025-09-07T07:51:01.8311366Z env: 2025-09-07T07:51:01.8311597Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:51:01.8311860Z ##[endgroup] 2025-09-07T07:51:01.8602306Z ##[group]Run set -ex 2025-09-07T07:51:01.8602990Z set -ex 2025-09-07T07:51:01.8603340Z  2025-09-07T07:51:01.8603828Z # If the docker build directory or the build script doesn't exist, the action will 2025-09-07T07:51:01.8604525Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2025-09-07T07:51:01.8605102Z # job could then download the pre-built image as usual 2025-09-07T07:51:01.8605821Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-09-07T07:51:01.8606498Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-09-07T07:51:01.8606841Z else 2025-09-07T07:51:01.8607105Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-09-07T07:51:01.8607557Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:51:01.8607978Z  2025-09-07T07:51:01.8608566Z  echo "Not using custom ECR registry. Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 
2025-09-07T07:51:01.8609220Z  exit 0 2025-09-07T07:51:01.8609439Z fi 2025-09-07T07:51:01.8609654Z  2025-09-07T07:51:01.8610003Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-09-07T07:51:01.8610630Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-09-07T07:51:01.8611187Z  # use it as it is, but first let's extract the tag 2025-09-07T07:51:01.8611672Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-09-07T07:51:01.8612198Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:51:01.8612700Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:51:01.8613119Z else 2025-09-07T07:51:01.8613391Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-09-07T07:51:01.8613992Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-09-07T07:51:01.8614411Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-09-07T07:51:01.8614761Z  fi 2025-09-07T07:51:01.8615229Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-09-07T07:51:01.8615857Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:51:01.8616519Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:51:01.8617252Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:51:01.8617774Z fi 2025-09-07T07:51:01.8655000Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:51:01.8655702Z env: 2025-09-07T07:51:01.8656074Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:51:01.8656563Z REPO_NAME: pytorch 2025-09-07T07:51:01.8658763Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:01.8660719Z DOCKER_BUILD_DIR: .ci/docker 2025-09-07T07:51:01.8661230Z DOCKER_BUILD_SCRIPT: 
./build.sh 2025-09-07T07:51:01.8661921Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:51:01.8662693Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-09-07T07:51:01.8663240Z CUSTOM_TAG_PREFIX: 2025-09-07T07:51:01.8663668Z ##[endgroup] 2025-09-07T07:51:01.8711797Z + [[ -d .ci/docker ]] 2025-09-07T07:51:01.8712167Z + [[ -f .ci/docker/./build.sh ]] 2025-09-07T07:51:01.8712498Z + [[ true == \t\r\u\e ]] 2025-09-07T07:51:01.8712775Z + echo skip=false 2025-09-07T07:51:01.8714014Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 == *\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-09-07T07:51:01.8720325Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:01.8721583Z ++ awk -F '[:,]' '{print $2}' 2025-09-07T07:51:01.8734480Z + DOCKER_TAG=pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:01.8735661Z + echo docker-tag=pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:01.8737303Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:01.8853267Z ##[group]Run set +e 2025-09-07T07:51:01.8853774Z set +e 2025-09-07T07:51:01.8854013Z set -x 2025-09-07T07:51:01.8854255Z  2025-09-07T07:51:01.8854472Z login() { 2025-09-07T07:51:01.8855246Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-09-07T07:51:01.8855900Z } 2025-09-07T07:51:01.8856117Z  2025-09-07T07:51:01.8856329Z retry () { 
2025-09-07T07:51:01.8856593Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-09-07T07:51:01.8856916Z } 2025-09-07T07:51:01.8857125Z  2025-09-07T07:51:01.8857359Z retry login "${DOCKER_REGISTRY}" 2025-09-07T07:51:01.8857758Z  2025-09-07T07:51:01.8858012Z START_TIME=$(date +%s) 2025-09-07T07:51:01.8858487Z # Wait up to 120 minutes 2025-09-07T07:51:01.8858967Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-09-07T07:51:01.8859463Z  # Check if image already exists, if it does then skip building it 2025-09-07T07:51:01.8859970Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2025-09-07T07:51:01.8860641Z  exit 0 2025-09-07T07:51:01.8861002Z  fi 2025-09-07T07:51:01.8861344Z  2025-09-07T07:51:01.8861742Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-09-07T07:51:01.8862545Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-09-07T07:51:01.8863520Z  # latter, it will wait for the Docker images to become available before continuing 2025-09-07T07:51:01.8864062Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-09-07T07:51:01.8864462Z  # It's a Docker build job, let's build the image 2025-09-07T07:51:01.8864818Z  break 2025-09-07T07:51:01.8865054Z  else 2025-09-07T07:51:01.8865461Z  # It's a regular build job, wait for the image to become available 2025-09-07T07:51:01.8866092Z  sleep 300 2025-09-07T07:51:01.8866495Z  fi 2025-09-07T07:51:01.8866722Z done 2025-09-07T07:51:01.8866942Z  2025-09-07T07:51:01.8867496Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-09-07T07:51:01.8868088Z # be empty. 
The default action would be to continue rebuild the image 2025-09-07T07:51:01.8868620Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-09-07T07:51:01.8869091Z  # if we're on the base branch then use the parent commit 2025-09-07T07:51:01.8869514Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-09-07T07:51:01.8869832Z else 2025-09-07T07:51:01.8870165Z  # otherwise we're on a PR, so use the most recent base commit 2025-09-07T07:51:01.8870938Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-09-07T07:51:01.8871323Z fi 2025-09-07T07:51:01.8871523Z  2025-09-07T07:51:01.8871761Z if [[ -z "${MERGE_BASE}" ]]; then 2025-09-07T07:51:01.8872136Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-09-07T07:51:01.8872472Z  2025-09-07T07:51:01.8873023Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2025-09-07T07:51:01.8873902Z  exit 0 2025-09-07T07:51:01.8874201Z fi 2025-09-07T07:51:01.8874414Z  2025-09-07T07:51:01.8874727Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-09-07T07:51:01.8875804Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-09-07T07:51:01.8876790Z  exit 1 2025-09-07T07:51:01.8877133Z fi 2025-09-07T07:51:01.8877333Z  2025-09-07T07:51:01.8877708Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-09-07T07:51:01.8878395Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-09-07T07:51:01.8879244Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-09-07T07:51:01.8880344Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-09-07T07:51:01.8881367Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-09-07T07:51:01.8881842Z fi 2025-09-07T07:51:01.8882056Z  2025-09-07T07:51:01.8882319Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 
2025-09-07T07:51:01.8897514Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:51:01.8897904Z env: 2025-09-07T07:51:01.8898130Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:51:01.8898419Z DOCKER_BUILD_DIR: .ci/docker 2025-09-07T07:51:01.8898771Z BASE_REVISION: 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:51:01.8899934Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:01.8901884Z DOCKER_TAG: pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:01.8902975Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:51:01.8903617Z DOCKER_PUSH: 2025-09-07T07:51:01.8903985Z ##[endgroup] 2025-09-07T07:51:01.8980449Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:51:01.8981205Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:51:01.8983486Z + aws ecr get-login-password --region us-east-1 2025-09-07T07:51:01.8984823Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:51:02.6658849Z 2025-09-07T07:51:02.6659691Z WARNING! Your credentials are stored unencrypted in '/home/charlie/.docker/config.json'. 2025-09-07T07:51:02.6660467Z Configure a credential helper to remove this warning. 
See 2025-09-07T07:51:02.6660939Z https://docs.docker.com/go/credential-store/ 2025-09-07T07:51:02.6661221Z 2025-09-07T07:51:02.6661329Z Login Succeeded 2025-09-07T07:51:02.6682463Z ++ date +%s 2025-09-07T07:51:02.6692731Z + START_TIME=1757231462 2025-09-07T07:51:02.6696207Z ++ date +%s 2025-09-07T07:51:02.6703785Z + [[ 1757224262 -lt 1757231462 ]] 2025-09-07T07:51:02.6705160Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:02.8858102Z { 2025-09-07T07:51:02.8858399Z "schemaVersion": 2, 2025-09-07T07:51:02.8858850Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-09-07T07:51:02.8859293Z "config": { 2025-09-07T07:51:02.8859634Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-09-07T07:51:02.8860051Z "size": 31375, 2025-09-07T07:51:02.8860475Z "digest": "sha256:29d1d8a31b215537637bab7c99e18c255840b899cf7023e4e3cb5efa3270aef8" 2025-09-07T07:51:02.8860966Z }, 2025-09-07T07:51:02.8861166Z "layers": [ 2025-09-07T07:51:02.8861378Z { 2025-09-07T07:51:02.8861724Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:02.8862141Z "size": 30448359, 2025-09-07T07:51:02.8862579Z "digest": "sha256:e6fdc8487bfe6d764301ef3634bc6c043841dc3ab05ca14f81e69c0f92562d46" 2025-09-07T07:51:02.8863071Z }, 2025-09-07T07:51:02.8863262Z { 2025-09-07T07:51:02.8863579Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:02.8864001Z "size": 1554, 2025-09-07T07:51:02.8864418Z "digest": "sha256:171dcef20c49de4bc9268f60e02f111b72c638b0f24c3c5636c5013029db6d30" 2025-09-07T07:51:02.8864899Z }, 2025-09-07T07:51:02.8865078Z { 2025-09-07T07:51:02.8865408Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:02.8865832Z "size": 313297922, 2025-09-07T07:51:02.8866270Z "digest": 
"sha256:4c92b3f72f1df31fe9f487fc1c27fcf1ba475ffb43abd69056306d1247786e40" 2025-09-07T07:51:02.8866754Z }, 2025-09-07T07:51:02.8866948Z { 2025-09-07T07:51:02.8867378Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:02.8867807Z "size": 792, 2025-09-07T07:51:02.8868211Z "digest": "sha256:744f9ba90a6582eb601b3c20409bb10d6dad635dd118c3975f79721f4c82747c" 2025-09-07T07:51:02.8868690Z }, 2025-09-07T07:51:02.8868885Z { 2025-09-07T07:51:02.8869216Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:02.8869623Z "size": 106, 2025-09-07T07:51:02.8870029Z "digest": "sha256:d3c08322a3326e45849dd80264a047c4f42ba4a2419d35c919542e2890e23934" 2025-09-07T07:51:02.8870499Z }, 2025-09-07T07:51:02.8870693Z { 2025-09-07T07:51:02.8871008Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:02.8871435Z "size": 704, 2025-09-07T07:51:02.8871853Z "digest": "sha256:ffd43b71f3ccf3ba563606231cb1d191eb9dd0052f422d54835e6af350525170" 2025-09-07T07:51:02.8872338Z }, 2025-09-07T07:51:03.0197567Z { 2025-09-07T07:51:03.0198072Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0198806Z "size": 1215, 2025-09-07T07:51:03.0199526Z "digest": "sha256:830692b57f6e2758398ec80c3b67a20441d12696b54ed14f2ecebf926198f7d6" 2025-09-07T07:51:03.0200356Z }, 2025-09-07T07:51:03.0200691Z { 2025-09-07T07:51:03.0201252Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0202017Z "size": 482, 2025-09-07T07:51:03.0202727Z "digest": "sha256:5bad36d184686719399be50830a98939d7dbda2313fb407df5915217483fc6a3" 2025-09-07T07:51:03.0204095Z }, 2025-09-07T07:51:03.0204411Z { 2025-09-07T07:51:03.0204965Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0205697Z "size": 110343614, 2025-09-07T07:51:03.0206305Z "digest": "sha256:0e34fdd9ac5c39eb0a9d2c2d258b26f42bb79d7dc0a22014bf201daa2e033eb4" 2025-09-07T07:51:03.0206868Z }, 
2025-09-07T07:51:03.0207061Z { 2025-09-07T07:51:03.0207379Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0207819Z "size": 4786, 2025-09-07T07:51:03.0238128Z "digest": "sha256:3c868a62868ef54f82ac11be8dabe1b4365d000bacfe4c104e08022fc96dd767" 2025-09-07T07:51:03.0239126Z }, 2025-09-07T07:51:03.0239447Z { 2025-09-07T07:51:03.0240022Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0240791Z "size": 1710, 2025-09-07T07:51:03.0241528Z "digest": "sha256:62170a22dd571d55ffccac64c0be17f4006d2498cfbf7c6289325f0899cba005" 2025-09-07T07:51:03.0242362Z }, 2025-09-07T07:51:03.0242698Z { 2025-09-07T07:51:03.0243397Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0244166Z "size": 724, 2025-09-07T07:51:03.0244883Z "digest": "sha256:553c1d23b6c4dbd8ab136d0c3659460391ffa14cb9b43be9d7b2f47f90895697" 2025-09-07T07:51:03.0245725Z }, 2025-09-07T07:51:03.0246019Z { 2025-09-07T07:51:03.0246600Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0247302Z "size": 543, 2025-09-07T07:51:03.0248048Z "digest": "sha256:9408d557a804a7dce00897e03ce9f4f447281eb38ce4bc331098a1f1a5ff0d30" 2025-09-07T07:51:03.0248916Z }, 2025-09-07T07:51:03.0249239Z { 2025-09-07T07:51:03.0249796Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0250531Z "size": 3241148049, 2025-09-07T07:51:03.0251307Z "digest": "sha256:df607cfc7c07db6d442e0274e2be8cdc507df8716717363aa92f2fea069bdd9a" 2025-09-07T07:51:03.0252203Z }, 2025-09-07T07:51:03.0252526Z { 2025-09-07T07:51:03.0253108Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0253857Z "size": 32, 2025-09-07T07:51:03.0254569Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:51:03.0255446Z }, 2025-09-07T07:51:03.0255781Z { 2025-09-07T07:51:03.0256365Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0257097Z "size": 380, 2025-09-07T07:51:03.0257765Z "digest": "sha256:40a8e39faeda9f5273ff5014b2ef7d1ffeeef321de234186a705b1e0574326d2" 2025-09-07T07:51:03.0258544Z }, 2025-09-07T07:51:03.0258887Z { 2025-09-07T07:51:03.0259454Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0260165Z "size": 53548049, 2025-09-07T07:51:03.0260894Z "digest": "sha256:d895771c9faca390d7270f8c9c832b1428128c31ba6760b837d64b7e5920373f" 2025-09-07T07:51:03.0261757Z }, 2025-09-07T07:51:03.0262092Z { 2025-09-07T07:51:03.0262622Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0263311Z "size": 232, 2025-09-07T07:51:03.0263863Z "digest": "sha256:c4ee04f39d49efb46e52443e60c7f41832ea708d9bc5bf76c6d740895c66f57a" 2025-09-07T07:51:03.0264425Z }, 2025-09-07T07:51:03.0264687Z { 2025-09-07T07:51:03.0265072Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0265582Z "size": 3403403, 2025-09-07T07:51:03.0266324Z "digest": "sha256:3690c9826e48ed74e21e494d9d78990902abbc68795d002260ce71bff9a2cb3b" 2025-09-07T07:51:03.0266815Z }, 2025-09-07T07:51:03.0266994Z { 2025-09-07T07:51:03.0267399Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0268032Z "size": 1478, 2025-09-07T07:51:03.0268497Z "digest": "sha256:57cbc5013733eedfdf176b6db4b44458e826e1f64c0ef38849e9d77addc88936" 2025-09-07T07:51:03.0269217Z }, 2025-09-07T07:51:03.0269530Z { 2025-09-07T07:51:03.0270083Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0270818Z "size": 482, 2025-09-07T07:51:03.0271608Z "digest": "sha256:f5f4b06b58bbe4201d8b2eb5b0c6c1299f2725dd59e71cc45ef76ad89bba4deb" 2025-09-07T07:51:03.0272416Z }, 2025-09-07T07:51:03.0272749Z { 2025-09-07T07:51:03.0273319Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0274101Z "size": 197, 
2025-09-07T07:51:03.0274785Z "digest": "sha256:f59713ce4bf491fe1f663d90e3b32d2290a7d8a4a0e8e13301e3bdb10b949f8e" 2025-09-07T07:51:03.0275686Z }, 2025-09-07T07:51:03.0276021Z { 2025-09-07T07:51:03.0276853Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0277601Z "size": 608, 2025-09-07T07:51:03.0278330Z "digest": "sha256:fe0486521517e626cae4fcbd9c83eb3956aad3ab0f833becee187b830891417b" 2025-09-07T07:51:03.0279175Z }, 2025-09-07T07:51:03.0279519Z { 2025-09-07T07:51:03.0280074Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0280850Z "size": 7874747615, 2025-09-07T07:51:03.0281423Z "digest": "sha256:8c21cc3715a2d715295f0299d8d2443262a3ae8defc1921f3226a0a24fc9c8fe" 2025-09-07T07:51:03.0282009Z }, 2025-09-07T07:51:03.0282297Z { 2025-09-07T07:51:03.0283051Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0283758Z "size": 829, 2025-09-07T07:51:03.0284483Z "digest": "sha256:d37c58456a6a4aa45d78abdb95553b3de0c79d941e18dc757c2c39fd59819739" 2025-09-07T07:51:03.0285350Z }, 2025-09-07T07:51:03.0285665Z { 2025-09-07T07:51:03.0286238Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0286999Z "size": 36688200, 2025-09-07T07:51:03.0287759Z "digest": "sha256:d042f63abc13891184a9d8e0dcdfae9a0daa140dea919fd319f12dcab5c684eb" 2025-09-07T07:51:03.0288632Z }, 2025-09-07T07:51:03.0288965Z { 2025-09-07T07:51:03.0289519Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0290053Z "size": 104, 2025-09-07T07:51:03.0290537Z "digest": "sha256:621284a9c05a47131a59226f6847b5b76ad211908278c1bdb990029d42259941" 2025-09-07T07:51:03.0291073Z }, 2025-09-07T07:51:03.0291266Z { 2025-09-07T07:51:03.0291607Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0292064Z "size": 1496, 2025-09-07T07:51:03.0292609Z "digest": 
"sha256:85f605d2dd3a8378567d3d974f0ec4694ef5fd988b25aca5d9aebd7c9b9ff018" 2025-09-07T07:51:03.0293320Z }, 2025-09-07T07:51:03.0293504Z { 2025-09-07T07:51:03.0294021Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0294620Z "size": 454406172, 2025-09-07T07:51:03.0295303Z "digest": "sha256:381b5539e5981dc994e71ab212f50135c32128fe1cc35d78bc386da6dffe1d51" 2025-09-07T07:51:03.0296087Z }, 2025-09-07T07:51:03.0296423Z { 2025-09-07T07:51:03.0296964Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0297814Z "size": 162, 2025-09-07T07:51:03.0298536Z "digest": "sha256:a487c0c800295407a4c7ab88c5b9e891b8b6aab9e35e62994d124369fcd7ba87" 2025-09-07T07:51:03.0299378Z }, 2025-09-07T07:51:03.0299712Z { 2025-09-07T07:51:03.0300257Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0301015Z "size": 346, 2025-09-07T07:51:03.0301728Z "digest": "sha256:48bcb81e256634f4132369d8bac738d9d622b010e5802e5292f565edba9035df" 2025-09-07T07:51:03.0302610Z }, 2025-09-07T07:51:03.0303273Z { 2025-09-07T07:51:03.0303847Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0304613Z "size": 32, 2025-09-07T07:51:03.0305335Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:51:03.0306213Z }, 2025-09-07T07:51:03.0306537Z { 2025-09-07T07:51:03.0307096Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0307847Z "size": 106, 2025-09-07T07:51:03.0308593Z "digest": "sha256:e261928c0043c734790a38fa9ebf1bf8674801fa2f5051c3d2eac04e0f02b743" 2025-09-07T07:51:03.0309416Z }, 2025-09-07T07:51:03.0309755Z { 2025-09-07T07:51:03.0310330Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0311078Z "size": 425, 2025-09-07T07:51:03.0311807Z "digest": "sha256:0fea55428091bc98d5c48986120dd1da50b9b6cbd507408b2cdebdbe455e272e" 2025-09-07T07:51:03.0312699Z }, 
2025-09-07T07:51:03.0313039Z { 2025-09-07T07:51:03.0313627Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0314309Z "size": 20224775, 2025-09-07T07:51:03.0315348Z "digest": "sha256:b4291bccbb8428a38187cd286fef7c24bd4863c7872c4d1cf96404ec1a69b321" 2025-09-07T07:51:03.0316221Z }, 2025-09-07T07:51:03.0316559Z { 2025-09-07T07:51:03.0317119Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0317705Z "size": 108, 2025-09-07T07:51:03.0318250Z "digest": "sha256:ddc91b09189afc218499daee92ebc22c6deefb22ee115c52c07627ecbaf7b9d5" 2025-09-07T07:51:03.0318832Z }, 2025-09-07T07:51:03.0319106Z { 2025-09-07T07:51:03.0319688Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0320399Z "size": 640, 2025-09-07T07:51:03.0321132Z "digest": "sha256:7540c74286279d1d6a29cdb51d3421e64860c6af74ca4a95736725c0509791ed" 2025-09-07T07:51:03.0321953Z }, 2025-09-07T07:51:03.0322280Z { 2025-09-07T07:51:03.0323033Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0323779Z "size": 724, 2025-09-07T07:51:03.0324525Z "digest": "sha256:553c1d23b6c4dbd8ab136d0c3659460391ffa14cb9b43be9d7b2f47f90895697" 2025-09-07T07:51:03.0325406Z }, 2025-09-07T07:51:03.0325740Z { 2025-09-07T07:51:03.0326319Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0327037Z "size": 149, 2025-09-07T07:51:03.0327758Z "digest": "sha256:003c4e2598fb39f97ec7734271e034a48a3956a58429c9d06601770c2c40de11" 2025-09-07T07:51:03.0328338Z }, 2025-09-07T07:51:03.0328583Z { 2025-09-07T07:51:03.0328957Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0329383Z "size": 135, 2025-09-07T07:51:03.0329830Z "digest": "sha256:5687149362ae68fa2aa7d4ecd39fbf7ea86c0f6ced36a71f3c59f68f6c465cfc" 2025-09-07T07:51:03.0330421Z }, 2025-09-07T07:51:03.0330710Z { 2025-09-07T07:51:03.0331091Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0331687Z "size": 141, 2025-09-07T07:51:03.0332421Z "digest": "sha256:cdd2cf54eb2a3d8d034aa1556c9724d240b06397ba08f8b13b0bed6d65755aeb" 2025-09-07T07:51:03.0333291Z }, 2025-09-07T07:51:03.0333593Z { 2025-09-07T07:51:03.0334184Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0334920Z "size": 18615922074, 2025-09-07T07:51:03.0335653Z "digest": "sha256:d3ad4df1ba3a86ef1f84c427aae440ff027d483949d48eec4be6135260668cad" 2025-09-07T07:51:03.0336520Z }, 2025-09-07T07:51:03.0336860Z { 2025-09-07T07:51:03.0337511Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0338247Z "size": 223, 2025-09-07T07:51:03.0338977Z "digest": "sha256:3c9055753b4c79d74c707a91d8626ce10bc439129ba10dad3ebc643d9d4955dd" 2025-09-07T07:51:03.0339858Z }, 2025-09-07T07:51:03.0340192Z { 2025-09-07T07:51:03.0340754Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0341333Z "size": 353035275, 2025-09-07T07:51:03.0559711Z "digest": "sha256:31cf8d0bd21c76ae21f73d8b19b30949d161a498354f54191b4e5a294e929701" 2025-09-07T07:51:03.0560316Z }, 2025-09-07T07:51:03.0560511Z { 2025-09-07T07:51:03.0560832Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0561311Z "size": 6523020957, 2025-09-07T07:51:03.0561746Z "digest": "sha256:6623ea81497183b62e034e4ea8df8bf00fa75aaa192eea2821b2dd8655383b8f" 2025-09-07T07:51:03.0562628Z }, 2025-09-07T07:51:03.0563080Z { 2025-09-07T07:51:03.0563667Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0564416Z "size": 129, 2025-09-07T07:51:03.0565132Z "digest": "sha256:11696c3aa3808236d49256bc170b49d55cf657e499592b39b4856f6137220f55" 2025-09-07T07:51:03.0565950Z }, 2025-09-07T07:51:03.0566281Z { 2025-09-07T07:51:03.0566862Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0567616Z "size": 778, 
2025-09-07T07:51:03.0568343Z "digest": "sha256:ef4d544e35cacc73a229bcbc7a5510f8b156c7b3041f19f3a274562cd97cfd94" 2025-09-07T07:51:03.0569071Z }, 2025-09-07T07:51:03.0569267Z { 2025-09-07T07:51:03.0570029Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0570736Z "size": 724, 2025-09-07T07:51:03.0571472Z "digest": "sha256:553c1d23b6c4dbd8ab136d0c3659460391ffa14cb9b43be9d7b2f47f90895697" 2025-09-07T07:51:03.0572116Z }, 2025-09-07T07:51:03.0572311Z { 2025-09-07T07:51:03.0572839Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0573580Z "size": 141, 2025-09-07T07:51:03.0574319Z "digest": "sha256:5c5108865e5e293209ae9bae8a29645035242e7e4b4433208a777496fddc988c" 2025-09-07T07:51:03.0574988Z }, 2025-09-07T07:51:03.0575171Z { 2025-09-07T07:51:03.0575602Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0576044Z "size": 32, 2025-09-07T07:51:03.0576643Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:51:03.0577256Z }, 2025-09-07T07:51:03.0577601Z { 2025-09-07T07:51:03.0578026Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0578502Z "size": 159, 2025-09-07T07:51:03.0578984Z "digest": "sha256:9e97578e9edf1a11187740a5aa102633331fb6a714d0ed48683782de5a36fbd8" 2025-09-07T07:51:03.0579462Z }, 2025-09-07T07:51:03.0579699Z { 2025-09-07T07:51:03.0580116Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0580581Z "size": 1012, 2025-09-07T07:51:03.0581124Z "digest": "sha256:da5a91b54cb51f851560992645bc203f2287d9b1d7a4f04f7f4ea7efe45036ce" 2025-09-07T07:51:03.0581975Z }, 2025-09-07T07:51:03.0582288Z { 2025-09-07T07:51:03.0582859Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0583583Z "size": 724, 2025-09-07T07:51:03.0584295Z "digest": "sha256:553c1d23b6c4dbd8ab136d0c3659460391ffa14cb9b43be9d7b2f47f90895697" 
2025-09-07T07:51:03.0585195Z }, 2025-09-07T07:51:03.0585491Z { 2025-09-07T07:51:03.0586084Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0586808Z "size": 135, 2025-09-07T07:51:03.0587542Z "digest": "sha256:1e93be219e89e7733b91ba7e3af1a44d985e84959f732ecd5f5ca61bd13b5d41" 2025-09-07T07:51:03.0588403Z }, 2025-09-07T07:51:03.0588726Z { 2025-09-07T07:51:03.0589307Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0590047Z "size": 32, 2025-09-07T07:51:03.0590748Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:51:03.0591633Z }, 2025-09-07T07:51:03.0591935Z { 2025-09-07T07:51:03.0592424Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0593104Z "size": 158, 2025-09-07T07:51:03.0593771Z "digest": "sha256:136825afebb533ee295f0d2523595281086c6410c60d5f712b84cefd24cb31d5" 2025-09-07T07:51:03.0594256Z }, 2025-09-07T07:51:03.0594451Z { 2025-09-07T07:51:03.0595143Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0595867Z "size": 1368, 2025-09-07T07:51:03.0596615Z "digest": "sha256:22b39805302d877e4c1ba433ebc36520438ea29a9ba8bc059efbcd9106f3a82d" 2025-09-07T07:51:03.0597289Z }, 2025-09-07T07:51:03.0597570Z { 2025-09-07T07:51:03.0597959Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0598383Z "size": 32, 2025-09-07T07:51:03.0598813Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:51:03.0599331Z }, 2025-09-07T07:51:03.0599572Z { 2025-09-07T07:51:03.0599910Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0600338Z "size": 136, 2025-09-07T07:51:03.0600748Z "digest": "sha256:d12add675e3505e74eb9880eeef540ea0801282ca1ae01c3c221157cec91f5ae" 2025-09-07T07:51:03.0601237Z }, 2025-09-07T07:51:03.0601430Z { 2025-09-07T07:51:03.0601759Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0602173Z "size": 380, 2025-09-07T07:51:03.0602742Z "digest": "sha256:bc127046d33a7a98563698411b54ece8a167d520922879d7b69e8ca73a12d034" 2025-09-07T07:51:03.0603417Z }, 2025-09-07T07:51:03.0603616Z { 2025-09-07T07:51:03.0603935Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0604365Z "size": 32, 2025-09-07T07:51:03.0604781Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:51:03.0605266Z }, 2025-09-07T07:51:03.0605449Z { 2025-09-07T07:51:03.0605774Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0606278Z "size": 104, 2025-09-07T07:51:03.0606685Z "digest": "sha256:951e8ce838415c4257680a9d60d216f3750cbb18d243d9a21e2008cce7e589cf" 2025-09-07T07:51:03.0607148Z }, 2025-09-07T07:51:03.0607343Z { 2025-09-07T07:51:03.0607669Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0608098Z "size": 408, 2025-09-07T07:51:03.0608505Z "digest": "sha256:32340b97ae50ba7b2918ab40d6f4a8db875afee69318f484e4deb0a1e2ec4beb" 2025-09-07T07:51:03.0608982Z }, 2025-09-07T07:51:03.0609170Z { 2025-09-07T07:51:03.0609495Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0609902Z "size": 32, 2025-09-07T07:51:03.0610315Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:51:03.0610799Z }, 2025-09-07T07:51:03.0611010Z { 2025-09-07T07:51:03.0611368Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0611792Z "size": 109, 2025-09-07T07:51:03.0612216Z "digest": "sha256:5bbb04cd6b57ae13d7cf05ab9e9b4ed9752833ee2dba4eeaac47bde6022c4725" 2025-09-07T07:51:03.0612766Z }, 2025-09-07T07:51:03.0612946Z { 2025-09-07T07:51:03.0613329Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0613752Z "size": 1897, 
2025-09-07T07:51:03.0614225Z "digest": "sha256:d8c4b845cfc7ca7cc0604f472bf6da8b1f1d4e98dff3c76e1985a7013a5b9e3f" 2025-09-07T07:51:03.0614764Z }, 2025-09-07T07:51:03.0615003Z { 2025-09-07T07:51:03.0615333Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0615900Z "size": 243440375, 2025-09-07T07:51:03.0616637Z "digest": "sha256:b35c180f4d8ddc2396eac4a6b893f438481a8163ceb0b88f203488bc5f2a8ba4" 2025-09-07T07:51:03.0617546Z }, 2025-09-07T07:51:03.0617875Z { 2025-09-07T07:51:03.0618387Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0618803Z "size": 106, 2025-09-07T07:51:03.0619211Z "digest": "sha256:5f967b3c303a99e609441551f7c8988cca4fd464c0c3127506bff8509583091b" 2025-09-07T07:51:03.0619686Z }, 2025-09-07T07:51:03.0619882Z { 2025-09-07T07:51:03.0620198Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0620624Z "size": 166, 2025-09-07T07:51:03.0621036Z "digest": "sha256:04770904f012e5584f1c19a0bc92d9863baaebf08bf75b4a9981f2b7795c8953" 2025-09-07T07:51:03.0621701Z }, 2025-09-07T07:51:03.0621887Z { 2025-09-07T07:51:03.0622217Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0622643Z "size": 7943, 2025-09-07T07:51:03.0623062Z "digest": "sha256:73373941fb321b4cb4a171b1423a68a4c7fedada3a1498868d7efe93cb03170e" 2025-09-07T07:51:03.0623529Z }, 2025-09-07T07:51:03.0623720Z { 2025-09-07T07:51:03.0624047Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0624469Z "size": 8072, 2025-09-07T07:51:03.0624870Z "digest": "sha256:9572e6cd907bfa4888456dbccc6e22146a0044374585f3fa0a8ced19b831ed62" 2025-09-07T07:51:03.0625350Z }, 2025-09-07T07:51:03.0625543Z { 2025-09-07T07:51:03.0625873Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0626279Z "size": 304, 2025-09-07T07:51:03.0626693Z "digest": "sha256:64a544aba233551e38898f138dd6ba3161ccdb9554e0ffb5b9d8f0f7fe4a7fa8" 
2025-09-07T07:51:03.0627176Z }, 2025-09-07T07:51:03.0627369Z { 2025-09-07T07:51:03.0627813Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0628238Z "size": 13362696, 2025-09-07T07:51:03.0628655Z "digest": "sha256:7e35418a24997de5428763c93826679486760a1a9563209ae64de66ba45f99c1" 2025-09-07T07:51:03.0629122Z }, 2025-09-07T07:51:03.0629300Z { 2025-09-07T07:51:03.0629627Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0630048Z "size": 108, 2025-09-07T07:51:03.0630465Z "digest": "sha256:2ed8e82748d4a1131f41d9e41322f47a6ffef67a5a2b7bf5392237db5c035c61" 2025-09-07T07:51:03.0630932Z }, 2025-09-07T07:51:03.0631134Z { 2025-09-07T07:51:03.0631461Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0631881Z "size": 54145663, 2025-09-07T07:51:03.0632300Z "digest": "sha256:c988fbcccd708fb158a81c429d32e1060a7e40924fc3c987c629fa69d9484717" 2025-09-07T07:51:03.0632786Z }, 2025-09-07T07:51:03.0632978Z { 2025-09-07T07:51:03.0633309Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:51:03.0633717Z "size": 32, 2025-09-07T07:51:03.0634131Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:51:03.0634616Z } 2025-09-07T07:51:03.0634807Z ] 2025-09-07T07:51:03.0634988Z } 2025-09-07T07:51:03.0635213Z + exit 0 2025-09-07T07:51:03.0896234Z ##[group]Run set -eux 2025-09-07T07:51:03.0896530Z set -eux 2025-09-07T07:51:03.0896969Z # It's ok if this steps fails, it would then be an anonymous user like what we used to have 2025-09-07T07:51:03.0898264Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin || true 2025-09-07T07:51:03.0914989Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:51:03.0915684Z env: 2025-09-07T07:51:03.0916070Z 
GIT_DEFAULT_BRANCH: main 2025-09-07T07:51:03.0916524Z ##[endgroup] 2025-09-07T07:51:03.1190075Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-09-07T07:51:03.1190880Z + jq --raw-output .SecretString 2025-09-07T07:51:03.1192458Z + jq -r .docker_hub_readonly_token 2025-09-07T07:51:03.1194157Z + docker login --username pytorchbot --password-stdin 2025-09-07T07:51:03.9477851Z 2025-09-07T07:51:03.9478356Z Login Succeeded 2025-09-07T07:51:03.9478923Z WARNING! Your credentials are stored unencrypted in '/home/charlie/.docker/config.json'. 2025-09-07T07:51:03.9479606Z Configure a credential helper to remove this warning. See 2025-09-07T07:51:03.9480126Z https://docs.docker.com/go/credential-store/ 2025-09-07T07:51:03.9480375Z 2025-09-07T07:51:03.9669742Z ##[group]Run tag=${ECR_DOCKER_IMAGE##*:} 2025-09-07T07:51:03.9670505Z tag=${ECR_DOCKER_IMAGE##*:} 2025-09-07T07:51:03.9670990Z echo "docker pull ghcr.io/pytorch/ci-image:${tag/:/-}" 2025-09-07T07:51:03.9689008Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:51:03.9689523Z env: 2025-09-07T07:51:03.9689876Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:51:03.9691302Z ECR_DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:03.9692725Z ##[endgroup] 2025-09-07T07:51:04.1210109Z docker pull ghcr.io/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:04.1638536Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-09-07T07:51:04.1639320Z with: 2025-09-07T07:51:04.1640990Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:04.1642472Z docker-registry: 
308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:51:04.1643132Z env: 2025-09-07T07:51:04.1643355Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:51:04.1643609Z ##[endgroup] 2025-09-07T07:51:04.1756439Z ##[group]Run set -x 2025-09-07T07:51:04.1756913Z set -x 2025-09-07T07:51:04.1757313Z set +e 2025-09-07T07:51:04.1757706Z  2025-09-07T07:51:04.1758065Z login() { 2025-09-07T07:51:04.1758931Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-09-07T07:51:04.1759879Z } 2025-09-07T07:51:04.1760225Z  2025-09-07T07:51:04.1760660Z retry () { 2025-09-07T07:51:04.1761141Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-09-07T07:51:04.1761578Z } 2025-09-07T07:51:04.1761784Z  2025-09-07T07:51:04.1762086Z retry login "${DOCKER_REGISTRY}" 2025-09-07T07:51:04.1762398Z  2025-09-07T07:51:04.1763346Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-09-07T07:51:04.1764560Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-09-07T07:51:04.1765242Z  2025-09-07T07:51:04.1765603Z set -e 2025-09-07T07:51:04.1766208Z # ignore output since only exit code is used for conditional 2025-09-07T07:51:04.1767092Z # only pull docker image if it's not available locally 2025-09-07T07:51:04.1768072Z if ! 
docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-09-07T07:51:04.1768997Z  retry docker pull "${DOCKER_IMAGE}" 2025-09-07T07:51:04.1769579Z fi 2025-09-07T07:51:04.1788584Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:51:04.1789280Z env: 2025-09-07T07:51:04.1789665Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:51:04.1791417Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:04.1793431Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:51:04.1794077Z ##[endgroup] 2025-09-07T07:51:04.1866948Z + set +e 2025-09-07T07:51:04.1867272Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:51:04.1867771Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:51:04.1871460Z + aws ecr get-login-password --region us-east-1 2025-09-07T07:51:04.1872833Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:51:04.9398660Z 2025-09-07T07:51:04.9399218Z WARNING! Your credentials are stored unencrypted in '/home/charlie/.docker/config.json'. 2025-09-07T07:51:04.9400097Z Configure a credential helper to remove this warning. 
See 2025-09-07T07:51:04.9400906Z https://docs.docker.com/go/credential-store/ 2025-09-07T07:51:04.9401307Z 2025-09-07T07:51:04.9401483Z Login Succeeded 2025-09-07T07:51:04.9426004Z ++ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:04.9427337Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-09-07T07:51:05.1640883Z + IMAGE_SIZE=36183.606596946716 2025-09-07T07:51:05.1641325Z + echo 'Compressed size of image in MB: 36183.606596946716' 2025-09-07T07:51:05.1641724Z + set -e 2025-09-07T07:51:05.1650791Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:05.1652849Z Compressed size of image in MB: 36183.606596946716 2025-09-07T07:51:05.1757548Z + retry docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:05.1759301Z + docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:51:06.1011287Z pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77: Pulling from pytorch/ci-image 2025-09-07T07:51:06.1012350Z e6fdc8487bfe: Pulling fs layer 2025-09-07T07:51:06.1012673Z 171dcef20c49: Pulling fs layer 2025-09-07T07:51:06.1013046Z 4c92b3f72f1d: Pulling fs layer 2025-09-07T07:51:06.1013433Z 744f9ba90a65: Pulling fs layer 2025-09-07T07:51:06.1013742Z d3c08322a332: Pulling fs layer 2025-09-07T07:51:06.1014166Z ffd43b71f3cc: Pulling fs layer 2025-09-07T07:51:06.1014511Z 830692b57f6e: Pulling fs layer 
2025-09-07T07:51:06.1014851Z 5bad36d18468: Pulling fs layer 2025-09-07T07:51:06.1015135Z 0e34fdd9ac5c: Pulling fs layer 2025-09-07T07:51:06.1015454Z 3c868a62868e: Pulling fs layer 2025-09-07T07:51:06.1015735Z 62170a22dd57: Pulling fs layer 2025-09-07T07:51:06.1016023Z 744f9ba90a65: Waiting 2025-09-07T07:51:06.1016260Z ffd43b71f3cc: Waiting 2025-09-07T07:51:06.1016615Z 553c1d23b6c4: Pulling fs layer 2025-09-07T07:51:06.1016892Z d3c08322a332: Waiting 2025-09-07T07:51:06.1017173Z 0e34fdd9ac5c: Waiting 2025-09-07T07:51:06.1017635Z 830692b57f6e: Waiting 2025-09-07T07:51:06.1018079Z 9408d557a804: Pulling fs layer 2025-09-07T07:51:06.1018542Z 5bad36d18468: Waiting 2025-09-07T07:51:06.1018960Z 3c868a62868e: Waiting 2025-09-07T07:51:06.1019363Z 62170a22dd57: Waiting 2025-09-07T07:51:06.1019757Z 553c1d23b6c4: Waiting 2025-09-07T07:51:06.1020201Z df607cfc7c07: Pulling fs layer 2025-09-07T07:51:06.1020713Z 9408d557a804: Waiting 2025-09-07T07:51:06.1021048Z 4f4fb700ef54: Pulling fs layer 2025-09-07T07:51:06.1021329Z df607cfc7c07: Waiting 2025-09-07T07:51:06.1021639Z 40a8e39faeda: Pulling fs layer 2025-09-07T07:51:06.1021932Z d895771c9fac: Pulling fs layer 2025-09-07T07:51:06.1022295Z c4ee04f39d49: Pulling fs layer 2025-09-07T07:51:06.1022596Z 3690c9826e48: Pulling fs layer 2025-09-07T07:51:06.1022919Z 4f4fb700ef54: Waiting 2025-09-07T07:51:06.1023162Z d895771c9fac: Waiting 2025-09-07T07:51:06.1023517Z c4ee04f39d49: Waiting 2025-09-07T07:51:06.1023807Z 40a8e39faeda: Waiting 2025-09-07T07:51:06.1024060Z 57cbc5013733: Pulling fs layer 2025-09-07T07:51:06.1024382Z 3690c9826e48: Waiting 2025-09-07T07:51:06.1024620Z f5f4b06b58bb: Pulling fs layer 2025-09-07T07:51:06.1024894Z 57cbc5013733: Waiting 2025-09-07T07:51:06.1025142Z f59713ce4bf4: Pulling fs layer 2025-09-07T07:51:06.1025427Z fe0486521517: Pulling fs layer 2025-09-07T07:51:06.1025698Z 8c21cc3715a2: Pulling fs layer 2025-09-07T07:51:06.1025996Z d37c58456a6a: Pulling fs layer 2025-09-07T07:51:06.1026269Z fe0486521517: Waiting 
2025-09-07T07:51:06.1026509Z f5f4b06b58bb: Waiting 2025-09-07T07:51:06.1026736Z f59713ce4bf4: Waiting 2025-09-07T07:51:06.1026988Z d042f63abc13: Pulling fs layer 2025-09-07T07:51:06.1027259Z 8c21cc3715a2: Waiting 2025-09-07T07:51:06.1027544Z d37c58456a6a: Waiting 2025-09-07T07:51:06.1027780Z 621284a9c05a: Pulling fs layer 2025-09-07T07:51:06.1028434Z 85f605d2dd3a: Pulling fs layer 2025-09-07T07:51:06.1028718Z 381b5539e598: Pulling fs layer 2025-09-07T07:51:06.1028979Z d042f63abc13: Waiting 2025-09-07T07:51:06.1029221Z 85f605d2dd3a: Waiting 2025-09-07T07:51:06.1029473Z a487c0c80029: Pulling fs layer 2025-09-07T07:51:06.1029744Z 621284a9c05a: Waiting 2025-09-07T07:51:06.1029967Z 381b5539e598: Waiting 2025-09-07T07:51:06.1030214Z 48bcb81e2566: Pulling fs layer 2025-09-07T07:51:06.1030498Z e261928c0043: Pulling fs layer 2025-09-07T07:51:06.1030781Z 0fea55428091: Pulling fs layer 2025-09-07T07:51:06.1031250Z b4291bccbb84: Pulling fs layer 2025-09-07T07:51:06.1031533Z 48bcb81e2566: Waiting 2025-09-07T07:51:06.1031776Z e261928c0043: Waiting 2025-09-07T07:51:06.1032022Z 0fea55428091: Waiting 2025-09-07T07:51:06.1032465Z ddc91b09189a: Pulling fs layer 2025-09-07T07:51:06.1032955Z 7540c7428627: Pulling fs layer 2025-09-07T07:51:06.1033427Z 003c4e2598fb: Pulling fs layer 2025-09-07T07:51:06.1033917Z b4291bccbb84: Waiting 2025-09-07T07:51:06.1034262Z ddc91b09189a: Waiting 2025-09-07T07:51:06.1034489Z 7540c7428627: Waiting 2025-09-07T07:51:06.1034734Z 5687149362ae: Pulling fs layer 2025-09-07T07:51:06.1035038Z cdd2cf54eb2a: Pulling fs layer 2025-09-07T07:51:06.1035369Z 003c4e2598fb: Waiting 2025-09-07T07:51:06.1035612Z d3ad4df1ba3a: Pulling fs layer 2025-09-07T07:51:06.1036085Z cdd2cf54eb2a: Waiting 2025-09-07T07:51:06.1036492Z 5687149362ae: Waiting 2025-09-07T07:51:06.1036780Z 3c9055753b4c: Pulling fs layer 2025-09-07T07:51:06.1037043Z d3ad4df1ba3a: Waiting 2025-09-07T07:51:06.1037324Z 31cf8d0bd21c: Pulling fs layer 2025-09-07T07:51:06.1037633Z 6623ea814971: Pulling fs layer 
2025-09-07T07:51:06.1037907Z 3c9055753b4c: Waiting 2025-09-07T07:51:06.1038137Z 31cf8d0bd21c: Waiting 2025-09-07T07:51:06.1038390Z 11696c3aa380: Pulling fs layer 2025-09-07T07:51:06.1038683Z ef4d544e35ca: Pulling fs layer 2025-09-07T07:51:06.1038944Z 6623ea814971: Waiting 2025-09-07T07:51:06.1039181Z 11696c3aa380: Waiting 2025-09-07T07:51:06.1039436Z 5c5108865e5e: Pulling fs layer 2025-09-07T07:51:06.1039708Z ef4d544e35ca: Waiting 2025-09-07T07:51:06.1039948Z 9e97578e9edf: Pulling fs layer 2025-09-07T07:51:06.1040220Z 5c5108865e5e: Waiting 2025-09-07T07:51:06.1040510Z da5a91b54cb5: Pulling fs layer 2025-09-07T07:51:06.1040786Z 9e97578e9edf: Waiting 2025-09-07T07:51:06.1041011Z da5a91b54cb5: Waiting 2025-09-07T07:51:06.1041260Z 1e93be219e89: Pulling fs layer 2025-09-07T07:51:06.1041568Z 1e93be219e89: Waiting 2025-09-07T07:51:06.1041824Z 136825afebb5: Pulling fs layer 2025-09-07T07:51:06.1042133Z 22b39805302d: Pulling fs layer 2025-09-07T07:51:06.1042410Z 136825afebb5: Waiting 2025-09-07T07:51:06.1042666Z d12add675e35: Pulling fs layer 2025-09-07T07:51:06.1043262Z 22b39805302d: Waiting 2025-09-07T07:51:06.1043502Z bc127046d33a: Pulling fs layer 2025-09-07T07:51:06.1043777Z d12add675e35: Waiting 2025-09-07T07:51:06.1044013Z bc127046d33a: Waiting 2025-09-07T07:51:06.1044259Z 951e8ce83841: Pulling fs layer 2025-09-07T07:51:06.1044528Z 32340b97ae50: Pulling fs layer 2025-09-07T07:51:06.1044802Z 951e8ce83841: Waiting 2025-09-07T07:51:06.1045054Z 5bbb04cd6b57: Pulling fs layer 2025-09-07T07:51:06.1045329Z 32340b97ae50: Waiting 2025-09-07T07:51:06.1045568Z d8c4b845cfc7: Pulling fs layer 2025-09-07T07:51:06.1045858Z b35c180f4d8d: Pulling fs layer 2025-09-07T07:51:06.1046133Z 5bbb04cd6b57: Waiting 2025-09-07T07:51:06.1046378Z d8c4b845cfc7: Waiting 2025-09-07T07:51:06.1046619Z 5f967b3c303a: Pulling fs layer 2025-09-07T07:51:06.1046902Z 04770904f012: Pulling fs layer 2025-09-07T07:51:06.1047182Z 73373941fb32: Pulling fs layer 2025-09-07T07:51:06.1047455Z 04770904f012: Waiting 
2025-09-07T07:51:06.1047687Z 5f967b3c303a: Waiting 2025-09-07T07:51:06.1047922Z 73373941fb32: Waiting 2025-09-07T07:51:06.1048169Z 9572e6cd907b: Pulling fs layer 2025-09-07T07:51:06.1048453Z 64a544aba233: Pulling fs layer 2025-09-07T07:51:06.1048724Z 7e35418a2499: Pulling fs layer 2025-09-07T07:51:06.1048997Z 64a544aba233: Waiting 2025-09-07T07:51:06.1049235Z 9572e6cd907b: Waiting 2025-09-07T07:51:06.1049777Z 7e35418a2499: Waiting 2025-09-07T07:51:06.1050016Z 2ed8e82748d4: Pulling fs layer 2025-09-07T07:51:06.1050388Z c988fbcccd70: Pulling fs layer 2025-09-07T07:51:06.1050666Z c988fbcccd70: Waiting 2025-09-07T07:51:06.1050910Z 2ed8e82748d4: Waiting 2025-09-07T07:51:06.1874951Z 171dcef20c49: Verifying Checksum 2025-09-07T07:51:06.1875583Z 171dcef20c49: Download complete 2025-09-07T07:51:06.2807981Z 744f9ba90a65: Verifying Checksum 2025-09-07T07:51:06.2808379Z 744f9ba90a65: Download complete 2025-09-07T07:51:06.3994674Z d3c08322a332: Verifying Checksum 2025-09-07T07:51:06.3995079Z d3c08322a332: Download complete 2025-09-07T07:51:06.5125892Z ffd43b71f3cc: Verifying Checksum 2025-09-07T07:51:06.5126246Z ffd43b71f3cc: Download complete 2025-09-07T07:51:06.5130517Z e6fdc8487bfe: Verifying Checksum 2025-09-07T07:51:06.5130875Z e6fdc8487bfe: Download complete 2025-09-07T07:51:06.6409283Z 5bad36d18468: Verifying Checksum 2025-09-07T07:51:06.6409651Z 5bad36d18468: Download complete 2025-09-07T07:51:06.6454036Z 830692b57f6e: Verifying Checksum 2025-09-07T07:51:06.7473175Z 3c868a62868e: Verifying Checksum 2025-09-07T07:51:06.7473787Z 3c868a62868e: Download complete 2025-09-07T07:51:06.8448439Z 62170a22dd57: Verifying Checksum 2025-09-07T07:51:06.8448848Z 62170a22dd57: Download complete 2025-09-07T07:51:06.9592301Z 553c1d23b6c4: Verifying Checksum 2025-09-07T07:51:06.9592679Z 553c1d23b6c4: Download complete 2025-09-07T07:51:07.0508600Z 9408d557a804: Verifying Checksum 2025-09-07T07:51:07.0508982Z 9408d557a804: Download complete 2025-09-07T07:51:07.8647847Z 0e34fdd9ac5c: 
Verifying Checksum 2025-09-07T07:51:07.8648279Z 0e34fdd9ac5c: Download complete 2025-09-07T07:51:07.8734925Z 4f4fb700ef54: Verifying Checksum 2025-09-07T07:51:07.8735291Z 4f4fb700ef54: Download complete 2025-09-07T07:51:07.9730529Z 40a8e39faeda: Verifying Checksum 2025-09-07T07:51:07.9730914Z 40a8e39faeda: Download complete 2025-09-07T07:51:08.6231266Z d895771c9fac: Verifying Checksum 2025-09-07T07:51:08.6231701Z d895771c9fac: Download complete 2025-09-07T07:51:08.7023392Z c4ee04f39d49: Verifying Checksum 2025-09-07T07:51:08.7023783Z c4ee04f39d49: Download complete 2025-09-07T07:51:08.8882221Z 3690c9826e48: Verifying Checksum 2025-09-07T07:51:09.0055364Z 3690c9826e48: Download complete 2025-09-07T07:51:09.0055737Z 57cbc5013733: Verifying Checksum 2025-09-07T07:51:09.0056040Z 57cbc5013733: Download complete 2025-09-07T07:51:09.0919087Z f5f4b06b58bb: Download complete 2025-09-07T07:51:09.1972263Z f59713ce4bf4: Download complete 2025-09-07T07:51:09.2769420Z fe0486521517: Verifying Checksum 2025-09-07T07:51:09.2769813Z fe0486521517: Download complete 2025-09-07T07:51:09.4580739Z 4c92b3f72f1d: Verifying Checksum 2025-09-07T07:51:09.4581174Z 4c92b3f72f1d: Download complete 2025-09-07T07:51:09.5876123Z e6fdc8487bfe: Pull complete 2025-09-07T07:51:09.6061437Z d37c58456a6a: Verifying Checksum 2025-09-07T07:51:09.6061766Z d37c58456a6a: Download complete 2025-09-07T07:51:10.0873134Z d042f63abc13: Verifying Checksum 2025-09-07T07:51:10.0873777Z d042f63abc13: Download complete 2025-09-07T07:51:10.1850482Z 621284a9c05a: Verifying Checksum 2025-09-07T07:51:10.1850863Z 621284a9c05a: Download complete 2025-09-07T07:51:10.2650154Z 85f605d2dd3a: Verifying Checksum 2025-09-07T07:51:10.2650528Z 85f605d2dd3a: Download complete 2025-09-07T07:51:10.6659129Z 171dcef20c49: Pull complete 2025-09-07T07:51:19.6574297Z 381b5539e598: Verifying Checksum 2025-09-07T07:51:19.6574694Z 381b5539e598: Download complete 2025-09-07T07:51:19.8804968Z a487c0c80029: Verifying Checksum 
2025-09-07T07:51:19.8805433Z a487c0c80029: Download complete 2025-09-07T07:51:20.0478279Z 48bcb81e2566: Verifying Checksum 2025-09-07T07:51:20.0478938Z 48bcb81e2566: Download complete 2025-09-07T07:51:20.1664504Z e261928c0043: Verifying Checksum 2025-09-07T07:51:20.1664962Z e261928c0043: Download complete 2025-09-07T07:51:20.3773844Z 0fea55428091: Verifying Checksum 2025-09-07T07:51:20.3774249Z 0fea55428091: Download complete 2025-09-07T07:51:20.8620896Z b4291bccbb84: Verifying Checksum 2025-09-07T07:51:20.8621344Z b4291bccbb84: Download complete 2025-09-07T07:51:21.0568779Z ddc91b09189a: Verifying Checksum 2025-09-07T07:51:21.0569313Z ddc91b09189a: Download complete 2025-09-07T07:51:21.1964443Z 7540c7428627: Verifying Checksum 2025-09-07T07:51:21.1964908Z 7540c7428627: Download complete 2025-09-07T07:51:21.3075011Z 003c4e2598fb: Verifying Checksum 2025-09-07T07:51:21.3075548Z 003c4e2598fb: Download complete 2025-09-07T07:51:21.5445891Z 5687149362ae: Verifying Checksum 2025-09-07T07:51:21.5446407Z 5687149362ae: Download complete 2025-09-07T07:51:21.6718309Z cdd2cf54eb2a: Verifying Checksum 2025-09-07T07:51:21.6719368Z cdd2cf54eb2a: Download complete 2025-09-07T07:51:33.5209743Z 4c92b3f72f1d: Pull complete 2025-09-07T07:51:34.9007747Z 744f9ba90a65: Pull complete 2025-09-07T07:51:36.1155747Z d3c08322a332: Pull complete 2025-09-07T07:51:37.2291755Z ffd43b71f3cc: Pull complete 2025-09-07T07:51:38.7136705Z 830692b57f6e: Pull complete 2025-09-07T07:51:40.2137114Z 5bad36d18468: Pull complete 2025-09-07T07:51:47.3063158Z 0e34fdd9ac5c: Pull complete 2025-09-07T07:51:48.2492104Z 3c868a62868e: Pull complete 2025-09-07T07:51:48.7508167Z 62170a22dd57: Pull complete 2025-09-07T07:51:50.0741494Z df607cfc7c07: Verifying Checksum 2025-09-07T07:51:50.0742123Z df607cfc7c07: Download complete 2025-09-07T07:51:50.1768645Z 3c9055753b4c: Verifying Checksum 2025-09-07T07:51:50.1769273Z 3c9055753b4c: Download complete 2025-09-07T07:51:50.3849090Z 553c1d23b6c4: Pull complete 
2025-09-07T07:51:51.1754678Z 9408d557a804: Pull complete 2025-09-07T07:51:55.0597162Z 31cf8d0bd21c: Verifying Checksum 2025-09-07T07:51:55.0597606Z 31cf8d0bd21c: Download complete 2025-09-07T07:54:39.6654419Z 8c21cc3715a2: Verifying Checksum 2025-09-07T07:54:39.6654799Z 8c21cc3715a2: Download complete 2025-09-07T07:54:39.7370338Z 11696c3aa380: Verifying Checksum 2025-09-07T07:54:39.7370684Z 11696c3aa380: Download complete 2025-09-07T07:54:39.8048956Z ef4d544e35ca: Verifying Checksum 2025-09-07T07:54:39.8049342Z ef4d544e35ca: Download complete 2025-09-07T07:54:39.8917267Z 5c5108865e5e: Verifying Checksum 2025-09-07T07:54:39.8917667Z 5c5108865e5e: Download complete 2025-09-07T07:54:39.9674234Z 9e97578e9edf: Verifying Checksum 2025-09-07T07:54:39.9674617Z 9e97578e9edf: Download complete 2025-09-07T07:54:40.0474171Z da5a91b54cb5: Verifying Checksum 2025-09-07T07:54:40.0474566Z da5a91b54cb5: Download complete 2025-09-07T07:54:40.1250860Z 1e93be219e89: Verifying Checksum 2025-09-07T07:54:40.1251231Z 1e93be219e89: Download complete 2025-09-07T07:54:40.1805834Z 136825afebb5: Verifying Checksum 2025-09-07T07:54:40.1806188Z 136825afebb5: Download complete 2025-09-07T07:54:40.2439700Z 22b39805302d: Verifying Checksum 2025-09-07T07:54:40.2440086Z 22b39805302d: Download complete 2025-09-07T07:54:40.3245718Z d12add675e35: Verifying Checksum 2025-09-07T07:54:40.3246091Z d12add675e35: Download complete 2025-09-07T07:54:40.4092176Z bc127046d33a: Download complete 2025-09-07T07:54:40.4874787Z 951e8ce83841: Verifying Checksum 2025-09-07T07:54:40.4875137Z 951e8ce83841: Download complete 2025-09-07T07:54:40.6514818Z 32340b97ae50: Verifying Checksum 2025-09-07T07:54:40.6515161Z 32340b97ae50: Download complete 2025-09-07T07:54:40.7333462Z 5bbb04cd6b57: Verifying Checksum 2025-09-07T07:54:40.7333858Z 5bbb04cd6b57: Download complete 2025-09-07T07:54:40.8186974Z d8c4b845cfc7: Verifying Checksum 2025-09-07T07:54:40.8187390Z d8c4b845cfc7: Download complete 2025-09-07T07:55:00.2746366Z 
b35c180f4d8d: Verifying Checksum 2025-09-07T07:55:00.2746755Z b35c180f4d8d: Download complete 2025-09-07T07:55:00.3508080Z 5f967b3c303a: Verifying Checksum 2025-09-07T07:55:00.3508457Z 5f967b3c303a: Download complete 2025-09-07T07:55:00.4133598Z 04770904f012: Verifying Checksum 2025-09-07T07:55:00.4133923Z 04770904f012: Download complete 2025-09-07T07:55:00.4891748Z 73373941fb32: Download complete 2025-09-07T07:55:00.5745167Z 9572e6cd907b: Download complete 2025-09-07T07:55:00.6550010Z 64a544aba233: Verifying Checksum 2025-09-07T07:55:00.6550347Z 64a544aba233: Download complete 2025-09-07T07:55:02.1047623Z 7e35418a2499: Verifying Checksum 2025-09-07T07:55:02.1048370Z 7e35418a2499: Download complete 2025-09-07T07:55:02.1867105Z 2ed8e82748d4: Verifying Checksum 2025-09-07T07:55:02.1867431Z 2ed8e82748d4: Download complete 2025-09-07T07:55:08.0447560Z c988fbcccd70: Verifying Checksum 2025-09-07T07:55:08.0447958Z c988fbcccd70: Download complete 2025-09-07T07:59:39.1975989Z 6623ea814971: Verifying Checksum 2025-09-07T07:59:39.1976374Z 6623ea814971: Download complete 2025-09-07T08:09:07.8456217Z d3ad4df1ba3a: Verifying Checksum 2025-09-07T08:09:07.8456614Z d3ad4df1ba3a: Download complete 2025-09-07T08:09:40.1000328Z df607cfc7c07: Pull complete 2025-09-07T08:09:43.1434448Z 4f4fb700ef54: Pull complete 2025-09-07T08:09:45.6314876Z 40a8e39faeda: Pull complete 2025-09-07T08:09:50.2882118Z d895771c9fac: Pull complete 2025-09-07T08:09:52.9907269Z c4ee04f39d49: Pull complete 2025-09-07T08:09:55.4850164Z 3690c9826e48: Pull complete 2025-09-07T08:09:58.3450159Z 57cbc5013733: Pull complete 2025-09-07T08:10:01.4091746Z f5f4b06b58bb: Pull complete 2025-09-07T08:10:04.2304111Z f59713ce4bf4: Pull complete 2025-09-07T08:10:07.0809312Z fe0486521517: Pull complete 2025-09-07T08:20:01.8364061Z 8c21cc3715a2: Pull complete 2025-09-07T08:20:05.8750813Z d37c58456a6a: Pull complete 2025-09-07T08:20:12.1807545Z d042f63abc13: Pull complete 2025-09-07T08:20:16.1295245Z 621284a9c05a: Pull complete 
2025-09-07T08:20:19.8450108Z 85f605d2dd3a: Pull complete 2025-09-07T08:20:47.5950170Z 381b5539e598: Pull complete 2025-09-07T08:20:50.4601158Z a487c0c80029: Pull complete 2025-09-07T08:20:53.7085313Z 48bcb81e2566: Pull complete 2025-09-07T08:21:00.1384224Z e261928c0043: Pull complete 2025-09-07T08:21:03.6239017Z 0fea55428091: Pull complete 2025-09-07T08:21:07.8157049Z b4291bccbb84: Pull complete 2025-09-07T08:21:11.3330156Z ddc91b09189a: Pull complete 2025-09-07T08:21:15.0591714Z 7540c7428627: Pull complete 2025-09-07T08:21:22.3156698Z 003c4e2598fb: Pull complete 2025-09-07T08:21:26.0083570Z 5687149362ae: Pull complete 2025-09-07T08:21:29.0105992Z cdd2cf54eb2a: Pull complete 2025-09-07T08:43:23.3852914Z d3ad4df1ba3a: Pull complete 2025-09-07T08:43:26.0437996Z 3c9055753b4c: Pull complete 2025-09-07T08:43:34.8089842Z 31cf8d0bd21c: Pull complete 2025-09-07T08:56:12.6053669Z 6623ea814971: Pull complete 2025-09-07T08:56:15.3523573Z 11696c3aa380: Pull complete 2025-09-07T08:56:17.6088295Z ef4d544e35ca: Pull complete 2025-09-07T08:56:23.2660999Z 5c5108865e5e: Pull complete 2025-09-07T08:56:28.8403808Z 9e97578e9edf: Pull complete 2025-09-07T08:56:31.6524595Z da5a91b54cb5: Pull complete 2025-09-07T08:56:37.0089017Z 1e93be219e89: Pull complete 2025-09-07T08:56:42.6010309Z 136825afebb5: Pull complete 2025-09-07T08:56:45.4769737Z 22b39805302d: Pull complete 2025-09-07T08:56:51.0622231Z d12add675e35: Pull complete 2025-09-07T08:56:53.8978977Z bc127046d33a: Pull complete 2025-09-07T08:56:59.0279638Z 951e8ce83841: Pull complete 2025-09-07T08:57:01.6943805Z 32340b97ae50: Pull complete 2025-09-07T08:57:07.4567214Z 5bbb04cd6b57: Pull complete 2025-09-07T08:57:10.0297690Z d8c4b845cfc7: Pull complete 2025-09-07T08:57:23.1849294Z b35c180f4d8d: Pull complete 2025-09-07T08:57:26.5205338Z 5f967b3c303a: Pull complete 2025-09-07T08:57:28.5871142Z 04770904f012: Pull complete 2025-09-07T08:57:31.5874073Z 73373941fb32: Pull complete 2025-09-07T08:57:34.8800415Z 9572e6cd907b: Pull complete 
2025-09-07T08:57:37.3004716Z 64a544aba233: Pull complete 2025-09-07T08:57:40.6731200Z 7e35418a2499: Pull complete 2025-09-07T08:57:43.8825046Z 2ed8e82748d4: Pull complete 2025-09-07T08:57:49.6379570Z c988fbcccd70: Pull complete 2025-09-07T08:57:53.2240837Z Digest: sha256:f30843ff9ea9e117a2c8e6d207e85c9e77dfe682f1dfcdfea5b94178d1bf00b3 2025-09-07T08:57:53.9473105Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T08:57:54.0499258Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T08:57:54.0719495Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-09-07T08:57:54.0720537Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-09-07T08:57:54.0734880Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:57:54.0735302Z env: 2025-09-07T08:57:54.0735525Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:57:54.0735813Z ##[endgroup] 2025-09-07T08:57:54.1176260Z ##[group]Run echo "GPU_FLAG=--gpus all -e NVIDIA_DRIVER_CAPABILITIES=all" >> "${GITHUB_ENV}" 2025-09-07T08:57:54.1176962Z echo "GPU_FLAG=--gpus all -e NVIDIA_DRIVER_CAPABILITIES=all" >> "${GITHUB_ENV}" 2025-09-07T08:57:54.1188831Z shell: /usr/bin/bash -e {0} 2025-09-07T08:57:54.1189129Z env: 2025-09-07T08:57:54.1189359Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:57:54.1189636Z ##[endgroup] 2025-09-07T08:57:54.1730563Z ##[group]Run echo "SCCACHE_SERVER_PORT_DOCKER_FLAG=-e SCCACHE_SERVER_PORT=$((RUNNER_UID + 4226))" >> "${GITHUB_ENV}" 2025-09-07T08:57:54.1731456Z echo "SCCACHE_SERVER_PORT_DOCKER_FLAG=-e SCCACHE_SERVER_PORT=$((RUNNER_UID + 4226))" >> 
"${GITHUB_ENV}" 2025-09-07T08:57:54.1743108Z shell: /usr/bin/bash -e {0} 2025-09-07T08:57:54.1743411Z env: 2025-09-07T08:57:54.1743641Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:57:54.1743995Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:57:54.1744358Z ##[endgroup] 2025-09-07T08:57:54.1986396Z Prepare all required actions 2025-09-07T08:57:54.2181667Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-09-07T08:57:54.2182031Z with: 2025-09-07T08:57:54.2182476Z github-token: *** 2025-09-07T08:57:54.2182704Z env: 2025-09-07T08:57:54.2182925Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:57:54.2183275Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:57:54.2183752Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:57:54.2184154Z ##[endgroup] 2025-09-07T08:57:54.2750875Z ##[group]Run set -eux 2025-09-07T08:57:54.2751164Z set -eux 2025-09-07T08:57:54.2751623Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-09-07T08:57:54.2763277Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:57:54.2763669Z env: 2025-09-07T08:57:54.2763901Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:57:54.2764251Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:57:54.2764771Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:57:54.2765373Z GITHUB_TOKEN: *** 2025-09-07T08:57:54.2765625Z ##[endgroup] 2025-09-07T08:57:54.2976052Z + python3 .github/scripts/get_workflow_job_id.py 17525309334 i-03028b1668c838483-1003 2025-09-07T08:57:55.0609666Z Setting output job-id=49775768433 2025-09-07T08:57:55.0610297Z Setting output job-name=cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T08:57:55.0958558Z ##[group]Run python3 -m pip install psutil==5.9.8 dataclasses_json==0.6.7 nvidia-ml-py==11.525.84 2025-09-07T08:57:55.0959330Z python3 -m pip install psutil==5.9.8 dataclasses_json==0.6.7 
nvidia-ml-py==11.525.84 2025-09-07T08:57:55.0960328Z python3 -m tools.stats.monitor --log-interval "$MONITOR_LOG_INTERVAL" --data-collect-interval "$MONITOR_DATA_COLLECT_INTERVAL" > usage_log.txt 2>&1 & 2025-09-07T08:57:55.0961204Z echo "monitor-script-pid=${!}" >> "${GITHUB_OUTPUT}" 2025-09-07T08:57:55.0974088Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:57:55.0974490Z env: 2025-09-07T08:57:55.0974720Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:57:55.0975068Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:57:55.0975533Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:57:55.0975944Z JOB_ID: 49775768433 2025-09-07T08:57:55.0976460Z JOB_NAME: cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T08:57:55.0977214Z WORKFLOW_NAME: inductor-A100-perf-nightly 2025-09-07T08:57:55.0977657Z WORKFLOW_RUN_ID: 17525309334 2025-09-07T08:57:55.0977959Z MONITOR_LOG_INTERVAL: 15 2025-09-07T08:57:55.0978255Z MONITOR_DATA_COLLECT_INTERVAL: 4 2025-09-07T08:57:55.0978561Z ##[endgroup] 2025-09-07T08:57:55.4344832Z Defaulting to user installation because normal site-packages is not writeable 2025-09-07T08:57:56.0864131Z Collecting psutil==5.9.8 2025-09-07T08:57:56.1379134Z Downloading psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (288 kB) 2025-09-07T08:57:56.4554344Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 288.2/288.2 KB 883.3 kB/s eta 0:00:00 2025-09-07T08:57:56.9404668Z Collecting dataclasses_json==0.6.7 2025-09-07T08:57:56.9436049Z Downloading dataclasses_json-0.6.7-py3-none-any.whl (28 kB) 2025-09-07T08:57:57.7792637Z Collecting nvidia-ml-py==11.525.84 2025-09-07T08:57:57.7824420Z Downloading nvidia_ml_py-11.525.84-py3-none-any.whl (34 kB) 2025-09-07T08:57:58.5296611Z Collecting marshmallow<4.0.0,>=3.18.0 2025-09-07T08:57:58.5328019Z Downloading marshmallow-3.26.1-py3-none-any.whl (50 kB) 
2025-09-07T08:57:58.9216463Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50.9/50.9 KB 105.2 kB/s eta 0:00:00 2025-09-07T08:57:59.2713187Z Collecting typing-inspect<1,>=0.4.0 2025-09-07T08:57:59.2746895Z Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB) 2025-09-07T08:58:00.0310980Z Collecting packaging>=17.0 2025-09-07T08:58:00.0341818Z Downloading packaging-25.0-py3-none-any.whl (66 kB) 2025-09-07T08:58:00.3424748Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.5/66.5 KB 183.6 kB/s eta 0:00:00 2025-09-07T08:58:00.7235553Z Collecting typing-extensions>=3.7.4 2025-09-07T08:58:00.7267078Z Downloading typing_extensions-4.15.0-py3-none-any.whl (44 kB) 2025-09-07T08:58:01.0891340Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.6/44.6 KB 95.4 kB/s eta 0:00:00 2025-09-07T08:58:01.4851445Z Collecting mypy-extensions>=0.3.0 2025-09-07T08:58:01.4883549Z Downloading mypy_extensions-1.1.0-py3-none-any.whl (5.0 kB) 2025-09-07T08:58:01.8559290Z Installing collected packages: nvidia-ml-py, typing-extensions, psutil, packaging, mypy-extensions, typing-inspect, marshmallow, dataclasses_json 2025-09-07T08:58:06.2318964Z Successfully installed dataclasses_json-0.6.7 marshmallow-3.26.1 mypy-extensions-1.1.0 nvidia-ml-py-11.525.84 packaging-25.0 psutil-5.9.8 typing-extensions-4.15.0 typing-inspect-0.9.0 2025-09-07T08:58:06.3120716Z Prepare all required actions 2025-09-07T08:58:06.3121127Z Getting action download info 2025-09-07T08:58:06.4456627Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-09-07T08:58:08.3531968Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 2025-09-07T08:58:12.4830834Z ##[group]Run ./.github/actions/download-build-artifacts 2025-09-07T08:58:12.4831247Z with: 2025-09-07T08:58:12.4831520Z name: linux-jammy-cuda12.8-py3.10-gcc9-sm80 2025-09-07T08:58:12.4831866Z s3-bucket: gha-artifacts 2025-09-07T08:58:12.4832149Z env: 
2025-09-07T08:58:12.4832379Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:12.4832740Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:12.4833211Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:12.4833625Z ##[endgroup] 2025-09-07T08:58:12.5140276Z ##[group]Run seemethere/download-artifact-s3@v4 2025-09-07T08:58:12.5140640Z with: 2025-09-07T08:58:12.5140890Z name: linux-jammy-cuda12.8-py3.10-gcc9-sm80 2025-09-07T08:58:12.5141246Z s3-bucket: gha-artifacts 2025-09-07T08:58:12.5141532Z region: us-east-1 2025-09-07T08:58:12.5141754Z env: 2025-09-07T08:58:12.5141972Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:12.5142319Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:12.5142799Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:12.5143419Z ##[endgroup] 2025-09-07T08:58:12.9784984Z (node:11323) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-09-07T08:58:12.9785507Z 2025-09-07T08:58:12.9785715Z Please migrate your code to use AWS SDK for JavaScript (v3). 
2025-09-07T08:58:12.9786283Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-09-07T08:58:12.9786861Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-09-07T08:58:13.0687387Z Found 1 objects with prefix pytorch/pytorch/17525309334/linux-jammy-cuda12.8-py3.10-gcc9-sm80/ 2025-09-07T08:58:13.0688088Z Starting download (1/1): /home/charlie/_work/pytorch/pytorch/artifacts.zip 2025-09-07T08:58:21.5791753Z Finished download (1/1): /home/charlie/_work/pytorch/pytorch/artifacts.zip 2025-09-07T08:58:21.5800373Z Artifact download has finished successfully 2025-09-07T08:58:21.6192459Z ##[group]Run unzip -o artifacts.zip 2025-09-07T08:58:21.6192825Z unzip -o artifacts.zip 2025-09-07T08:58:21.6205815Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:58:21.6206203Z env: 2025-09-07T08:58:21.6206430Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:21.6206773Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:21.6207243Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:21.6207643Z ##[endgroup] 2025-09-07T08:58:21.6428179Z Archive: artifacts.zip 2025-09-07T08:58:21.6429568Z creating: dist/ 2025-09-07T08:58:23.5868330Z inflating: dist/torch-2.9.0a0+git93fb23d-cp310-cp310-linux_x86_64.whl 2025-09-07T08:58:23.5868845Z creating: dist/vision/ 2025-09-07T08:58:23.5988010Z inflating: dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-09-07T08:58:23.5988538Z creating: dist/audio/ 2025-09-07T08:58:23.6045455Z inflating: dist/audio/torchaudio-2.8.0a0+2e30055-cp310-cp310-linux_x86_64.whl 2025-09-07T08:58:23.6045966Z creating: dist/torchrec/ 2025-09-07T08:58:23.6071884Z inflating: dist/torchrec/torchrec-0.3.2-py3-none-any.whl 2025-09-07T08:58:23.6072312Z creating: dist/fbgemm_gpu/ 2025-09-07T08:58:24.5060151Z inflating: dist/fbgemm_gpu/fbgemm_gpu-0.4.1.post421-cp310-cp310-linux_x86_64.whl 2025-09-07T08:58:24.5060693Z creating: dist/ao/ 
2025-09-07T08:58:24.5102185Z inflating: dist/ao/torchao-0.7.0+git51c87b6e-py3-none-any.whl 2025-09-07T08:58:24.5233184Z inflating: dist/.ninja_log 2025-09-07T08:58:24.5233626Z creating: build/custom_test_artifacts/ 2025-09-07T08:58:24.5234468Z creating: build/custom_test_artifacts/custom-op-build/ 2025-09-07T08:58:24.5234995Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-09-07T08:58:24.5235593Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-09-07T08:58:24.5241854Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2025-09-07T08:58:24.5242554Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/ 2025-09-07T08:58:24.5243384Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-09-07T08:58:24.5244124Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-09-07T08:58:24.5244828Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-09-07T08:58:24.5245657Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-09-07T08:58:24.5246679Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-09-07T08:58:24.5247467Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-09-07T08:58:24.5248230Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-09-07T08:58:24.5248962Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-09-07T08:58:24.5250077Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-09-07T08:58:24.5251257Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-09-07T08:58:24.5252076Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-09-07T08:58:24.5253426Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-09-07T08:58:24.5254686Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-09-07T08:58:24.5255525Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/ 2025-09-07T08:58:24.5256260Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/ 2025-09-07T08:58:24.5298375Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-09-07T08:58:24.5340928Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-09-07T08:58:24.5342036Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-09-07T08:58:24.5389798Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-09-07T08:58:24.5390883Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-09-07T08:58:24.5391960Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-09-07T08:58:24.5393090Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-09-07T08:58:24.5394177Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-09-07T08:58:24.5395238Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-09-07T08:58:24.5396303Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-09-07T08:58:24.5397546Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-09-07T08:58:24.5398573Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-09-07T08:58:24.5399551Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-09-07T08:58:24.5400499Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-09-07T08:58:24.5401425Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-09-07T08:58:24.5402368Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-09-07T08:58:24.5403378Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.o 2025-09-07T08:58:24.5404300Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-09-07T08:58:24.5473812Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/a.out 2025-09-07T08:58:24.5474627Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeCUDACompiler.cmake 2025-09-07T08:58:24.5548828Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CUDA.bin 2025-09-07T08:58:24.5549636Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-09-07T08:58:24.5550436Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-09-07T08:58:24.5551103Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-09-07T08:58:24.5551811Z creating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-09-07T08:58:24.5552581Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-09-07T08:58:24.5553469Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-09-07T08:58:24.5554299Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-09-07T08:58:24.5555090Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-09-07T08:58:24.5555900Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-09-07T08:58:24.5556722Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-09-07T08:58:24.5557549Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-09-07T08:58:24.5558351Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-09-07T08:58:24.5559156Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-09-07T08:58:24.5574089Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-09-07T08:58:24.5775801Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-09-07T08:58:24.5776594Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-09-07T08:58:24.5777467Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-09-07T08:58:24.5778540Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-09-07T08:58:24.5779465Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-09-07T08:58:24.5780291Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-09-07T08:58:24.5781141Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-09-07T08:58:24.5782289Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-09-07T08:58:24.5783160Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-09-07T08:58:24.5784030Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-09-07T08:58:24.5784885Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2025-09-07T08:58:24.5800172Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-09-07T08:58:24.5880452Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-09-07T08:58:24.5881374Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-09-07T08:58:24.5882191Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-09-07T08:58:24.5883067Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-09-07T08:58:24.5883733Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2025-09-07T08:58:24.5884405Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-09-07T08:58:24.5885152Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/InstallScripts.json 2025-09-07T08:58:24.5885857Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc 2025-09-07T08:58:24.5886654Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-09-07T08:58:24.5887325Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 
2025-09-07T08:58:24.5887918Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-09-07T08:58:24.6060979Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-09-07T08:58:24.6117064Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-09-07T08:58:24.6117605Z creating: build/custom_test_artifacts/jit-hook-build/ 2025-09-07T08:58:24.6118096Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-09-07T08:58:24.6118682Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-09-07T08:58:24.6125259Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-09-07T08:58:24.6125947Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/ 2025-09-07T08:58:24.6126613Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-09-07T08:58:24.6127340Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-09-07T08:58:24.6128041Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-09-07T08:58:24.6128855Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-09-07T08:58:24.6129696Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-09-07T08:58:24.6130474Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-09-07T08:58:24.6131208Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-09-07T08:58:24.6131933Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-09-07T08:58:24.6132961Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-09-07T08:58:24.6134155Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-09-07T08:58:24.6134963Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-09-07T08:58:24.6136418Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-09-07T08:58:24.6137600Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-09-07T08:58:24.6138421Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/ 2025-09-07T08:58:24.6139152Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/ 2025-09-07T08:58:24.6181863Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-09-07T08:58:24.6224368Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-09-07T08:58:24.6225456Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-09-07T08:58:24.6273095Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-09-07T08:58:24.6274164Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-09-07T08:58:24.6275240Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-09-07T08:58:24.6276344Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-09-07T08:58:24.6277415Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-09-07T08:58:24.6278605Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-09-07T08:58:24.6279646Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-09-07T08:58:24.6280696Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-09-07T08:58:24.6281713Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-09-07T08:58:24.6282676Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-09-07T08:58:24.6283795Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-09-07T08:58:24.6284695Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-09-07T08:58:24.6285621Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-09-07T08:58:24.6286523Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.o 2025-09-07T08:58:24.6287425Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-09-07T08:58:24.6356191Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/a.out 2025-09-07T08:58:24.6357003Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeCUDACompiler.cmake 2025-09-07T08:58:24.6431020Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CUDA.bin 2025-09-07T08:58:24.6431833Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-09-07T08:58:24.6432475Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-09-07T08:58:24.6433133Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-09-07T08:58:24.6433839Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-09-07T08:58:24.6434619Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-09-07T08:58:24.6435669Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-09-07T08:58:24.6436547Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-09-07T08:58:24.6437354Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-09-07T08:58:24.6438192Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-09-07T08:58:24.6439022Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-09-07T08:58:24.6439875Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-09-07T08:58:24.6440710Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-09-07T08:58:24.6441522Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-09-07T08:58:24.6457338Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-09-07T08:58:24.6519458Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-09-07T08:58:24.6520342Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-09-07T08:58:24.6521145Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-09-07T08:58:24.6522007Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-09-07T08:58:24.6522671Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-09-07T08:58:24.6523430Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-09-07T08:58:24.6524119Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/InstallScripts.json 2025-09-07T08:58:24.6524810Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc 2025-09-07T08:58:24.6525578Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-09-07T08:58:24.6526292Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-09-07T08:58:24.6526877Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-09-07T08:58:24.6565024Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-09-07T08:58:24.6565579Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-09-07T08:58:24.6566131Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2025-09-07T08:58:24.6566781Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-09-07T08:58:24.6573081Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-09-07T08:58:24.6573812Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/ 2025-09-07T08:58:24.6574551Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-09-07T08:58:24.6575333Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-09-07T08:58:24.6576103Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-09-07T08:58:24.6576977Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-09-07T08:58:24.6577925Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-09-07T08:58:24.6578748Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-09-07T08:58:24.6579552Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-09-07T08:58:24.6580336Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-09-07T08:58:24.6581392Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-09-07T08:58:24.6582332Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-09-07T08:58:24.6583190Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-09-07T08:58:24.6584217Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-09-07T08:58:24.6585449Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-09-07T08:58:24.6586342Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/ 2025-09-07T08:58:24.6587140Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/ 2025-09-07T08:58:24.6628980Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-09-07T08:58:24.6671518Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-09-07T08:58:24.6672670Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-09-07T08:58:24.6720915Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-09-07T08:58:24.6722215Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-09-07T08:58:24.6723487Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-09-07T08:58:24.6724676Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-09-07T08:58:24.6725821Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-09-07T08:58:24.6726924Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-09-07T08:58:24.6728056Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-09-07T08:58:24.6729168Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-09-07T08:58:24.6730269Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-09-07T08:58:24.6731308Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-09-07T08:58:24.6732308Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-09-07T08:58:24.6733281Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-09-07T08:58:24.6734276Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-09-07T08:58:24.6735248Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.o 2025-09-07T08:58:24.6736223Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-09-07T08:58:24.6804204Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/a.out 2025-09-07T08:58:24.6805063Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeCUDACompiler.cmake 2025-09-07T08:58:24.6879161Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CUDA.bin 2025-09-07T08:58:24.6880170Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-09-07T08:58:24.6880860Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-09-07T08:58:24.6881582Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-09-07T08:58:24.6882352Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2025-09-07T08:58:24.6883301Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2025-09-07T08:58:24.6884278Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-09-07T08:58:24.6885217Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-09-07T08:58:24.6886089Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-09-07T08:58:24.6886998Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-09-07T08:58:24.6887903Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-09-07T08:58:24.6888815Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-09-07T08:58:24.6889724Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-09-07T08:58:24.6890729Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-09-07T08:58:24.6891689Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-09-07T08:58:24.7006321Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-09-07T08:58:24.7007223Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-09-07T08:58:24.7008140Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-09-07T08:58:24.7009165Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-09-07T08:58:24.7010154Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-09-07T08:58:24.7011084Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-09-07T08:58:24.7012020Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-09-07T08:58:24.7012983Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2025-09-07T08:58:24.7013950Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-09-07T08:58:24.7014912Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-09-07T08:58:24.7015863Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-09-07T08:58:24.7030719Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2025-09-07T08:58:24.7084406Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-09-07T08:58:24.7085425Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-09-07T08:58:24.7086303Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-09-07T08:58:24.7087088Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-09-07T08:58:24.7087965Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-09-07T08:58:24.7088674Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-09-07T08:58:24.7089440Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/InstallScripts.json 2025-09-07T08:58:24.7090191Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc 2025-09-07T08:58:24.7090865Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-09-07T08:58:24.7091489Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-09-07T08:58:24.7092122Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-09-07T08:58:24.7191246Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-09-07T08:58:24.7229976Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-09-07T08:58:24.7230493Z creating: build/lib/ 2025-09-07T08:58:24.7316288Z inflating: build/lib/libprotobuf-lite.a 2025-09-07T08:58:24.7764287Z inflating: build/lib/libprotobuf.a 2025-09-07T08:58:24.7773797Z inflating: build/lib/libpthreadpool.a 2025-09-07T08:58:24.7781617Z inflating: build/lib/libcpuinfo.a 2025-09-07T08:58:24.8278413Z inflating: build/lib/libprotoc.a 
2025-09-07T08:58:24.8286066Z inflating: build/lib/libcpuinfo_internals.a 2025-09-07T08:58:24.8286872Z inflating: build/lib/libclog.a 2025-09-07T08:58:24.8288875Z inflating: build/lib/libnnpack_reference_layers.a 2025-09-07T08:58:24.8307552Z inflating: build/lib/libpytorch_qnnpack.a 2025-09-07T08:58:24.8487476Z inflating: build/lib/libmicrokernels-prod.a 2025-09-07T08:58:24.8505234Z inflating: build/lib/libnnpack.a 2025-09-07T08:58:24.9290115Z inflating: build/lib/libmicrokernels-all.a 2025-09-07T08:58:24.9366222Z inflating: build/lib/libbenchmark.a 2025-09-07T08:58:24.9366812Z inflating: build/lib/libbenchmark_main.a 2025-09-07T08:58:24.9434237Z inflating: build/lib/libgtest.a 2025-09-07T08:58:24.9451279Z inflating: build/lib/libgmock.a 2025-09-07T08:58:24.9451997Z inflating: build/lib/libgtest_main.a 2025-09-07T08:58:24.9452437Z inflating: build/lib/libgmock_main.a 2025-09-07T08:58:24.9542779Z inflating: build/lib/libXNNPACK.a 2025-09-07T08:58:24.9543412Z inflating: build/lib/libjitprofiling.a 2025-09-07T08:58:24.9551121Z inflating: build/lib/libittnotify.a 2025-09-07T08:58:24.9617085Z inflating: build/lib/libasmjit.a 2025-09-07T08:58:25.0999609Z inflating: build/lib/libfbgemm.a 2025-09-07T08:58:25.1030244Z inflating: build/lib/libtensorpipe_uv.a 2025-09-07T08:58:25.1586618Z inflating: build/lib/libtensorpipe.a 2025-09-07T08:58:25.1836204Z inflating: build/lib/libtensorpipe_cuda.a 2025-09-07T08:58:25.1968252Z inflating: build/lib/libgloo.a 2025-09-07T08:58:25.2015973Z inflating: build/lib/libonnx_proto.a 2025-09-07T08:58:25.2735716Z inflating: build/lib/libonnx.a 2025-09-07T08:58:25.3171714Z inflating: build/lib/libgloo_cuda.a 2025-09-07T08:58:25.3190351Z inflating: build/lib/libfmt.a 2025-09-07T08:58:26.3510564Z inflating: build/lib/libdnnl.a 2025-09-07T08:58:26.3974882Z inflating: build/lib/libkineto.a 2025-09-07T08:58:26.6501840Z inflating: build/lib/libc10.so 2025-09-07T08:58:26.6503114Z inflating: build/lib/libtorch_global_deps.so 2025-09-07T08:58:26.6504756Z 
inflating: build/lib/libcaffe2_nvrtc.so 2025-09-07T08:58:26.6570361Z inflating: build/lib/libc10_cuda.so 2025-09-07T08:58:29.7725472Z inflating: build/lib/libtorch_cpu.so 2025-09-07T08:58:29.8488095Z inflating: build/lib/libtorch_nvshmem.so 2025-09-07T08:58:31.9350023Z inflating: build/lib/libtorch_cuda.so 2025-09-07T08:58:31.9350723Z inflating: build/lib/libtorch.so 2025-09-07T08:58:31.9400685Z inflating: build/lib/libtorch_cuda_linalg.so 2025-09-07T08:58:31.9470692Z inflating: build/lib/libtorchbind_test.so 2025-09-07T08:58:31.9490197Z inflating: build/lib/libjitbackend_test.so 2025-09-07T08:58:31.9514693Z inflating: build/lib/libbackend_with_compiler.so 2025-09-07T08:58:31.9541788Z inflating: build/lib/libaoti_custom_ops.so 2025-09-07T08:58:31.9544195Z inflating: build/lib/libc10d_cuda_test.so 2025-09-07T08:58:31.9548032Z inflating: build/lib/libshm.so 2025-09-07T08:58:32.1787941Z inflating: build/lib/libtorch_python.so 2025-09-07T08:58:32.1823959Z inflating: build/lib/libnnapi_backend.so 2025-09-07T08:58:32.1824342Z creating: build/bin/ 2025-09-07T08:58:32.2272440Z inflating: build/bin/protoc-3.13.0.0 2025-09-07T08:58:32.2720879Z inflating: build/bin/protoc 2025-09-07T08:58:32.2776783Z inflating: build/bin/c10_DeviceGuard_test 2025-09-07T08:58:32.2833297Z inflating: build/bin/c10_AllocatorConfig_test 2025-09-07T08:58:32.2888592Z inflating: build/bin/c10_Device_test 2025-09-07T08:58:32.2951981Z inflating: build/bin/c10_DispatchKeySet_test 2025-09-07T08:58:32.3005658Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-09-07T08:58:32.3057699Z inflating: build/bin/c10_StreamGuard_test 2025-09-07T08:58:32.3119381Z inflating: build/bin/c10_SymInt_test 2025-09-07T08:58:32.3176844Z inflating: build/bin/c10_Scalar_test 2025-09-07T08:58:32.3234314Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-09-07T08:58:32.3293522Z inflating: build/bin/c10_InlineStreamGuard_test 2025-09-07T08:58:32.3353304Z inflating: build/bin/c10_SizesAndStrides_test 
2025-09-07T08:58:32.3427414Z inflating: build/bin/c10_cow_test 2025-09-07T08:58:32.3481448Z inflating: build/bin/c10_ArrayRef_test 2025-09-07T08:58:32.3538016Z inflating: build/bin/c10_Bitset_test 2025-09-07T08:58:32.3590035Z inflating: build/bin/c10_ConstexprCrc_test 2025-09-07T08:58:32.3643310Z inflating: build/bin/c10_DeadlockDetection_test 2025-09-07T08:58:32.3704009Z inflating: build/bin/c10_Enumerate_test 2025-09-07T08:58:32.3757874Z inflating: build/bin/c10_Half_test 2025-09-07T08:58:32.3815108Z inflating: build/bin/c10_IntrusiveList_test 2025-09-07T08:58:32.3873602Z inflating: build/bin/c10_Metaprogramming_test 2025-09-07T08:58:32.3933554Z inflating: build/bin/c10_LeftRight_test 2025-09-07T08:58:32.3989896Z inflating: build/bin/c10_NetworkFlow_test 2025-09-07T08:58:32.4042990Z inflating: build/bin/c10_Semaphore_test 2025-09-07T08:58:32.4096461Z inflating: build/bin/c10_Synchronized_test 2025-09-07T08:58:32.4156402Z inflating: build/bin/c10_ThreadLocal_test 2025-09-07T08:58:32.4211130Z inflating: build/bin/c10_TypeList_test 2025-09-07T08:58:32.4263348Z inflating: build/bin/c10_TypeTraits_test 2025-09-07T08:58:32.4318445Z inflating: build/bin/c10_accumulate_test 2025-09-07T08:58:32.4377937Z inflating: build/bin/c10_bfloat16_test 2025-09-07T08:58:32.4438195Z inflating: build/bin/c10_complex_math_test 2025-09-07T08:58:32.4492715Z inflating: build/bin/c10_bit_cast_test 2025-09-07T08:58:32.4551278Z inflating: build/bin/c10_complex_test 2025-09-07T08:58:32.4604090Z inflating: build/bin/c10_error_test 2025-09-07T08:58:32.4660325Z inflating: build/bin/c10_exception_test 2025-09-07T08:58:32.4714135Z inflating: build/bin/c10_generic_math_test 2025-09-07T08:58:32.4768164Z inflating: build/bin/c10_flags_test 2025-09-07T08:58:32.4938273Z inflating: build/bin/c10_intrusive_ptr_test 2025-09-07T08:58:32.4992832Z inflating: build/bin/c10_irange_test 2025-09-07T08:58:32.5049761Z inflating: build/bin/c10_lazy_test 2025-09-07T08:58:32.5110591Z inflating: 
build/bin/c10_logging_test 2025-09-07T08:58:32.5168942Z inflating: build/bin/c10_registry_test 2025-09-07T08:58:32.5229064Z inflating: build/bin/c10_string_util_test 2025-09-07T08:58:32.5282982Z inflating: build/bin/c10_tempfile_test 2025-09-07T08:58:32.5338281Z inflating: build/bin/c10_TypeIndex_test 2025-09-07T08:58:32.5497830Z inflating: build/bin/c10_small_vector_test 2025-09-07T08:58:32.5546656Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-09-07T08:58:32.5612501Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-09-07T08:58:32.5691520Z inflating: build/bin/c10_optional_test 2025-09-07T08:58:32.5746665Z inflating: build/bin/c10_ssize_test 2025-09-07T08:58:32.5806944Z inflating: build/bin/c10_typeid_test 2025-09-07T08:58:32.5859155Z inflating: build/bin/c10_string_view_test 2025-09-07T08:58:32.5911555Z inflating: build/bin/c10_cuda_CUDATest 2025-09-07T08:58:32.6510261Z inflating: build/bin/vec_test_all_types_DEFAULT 2025-09-07T08:58:32.7132218Z inflating: build/bin/vec_test_all_types_AVX2 2025-09-07T08:58:32.7743151Z inflating: build/bin/vec_test_all_types_AVX512 2025-09-07T08:58:32.7799449Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_stream 2025-09-07T08:58:32.7854882Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_from_2_processes 2025-09-07T08:58:32.7911313Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_multiple_blocks 2025-09-07T08:58:32.7967283Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_same_block 2025-09-07T08:58:32.8023321Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_blocks_and_threads 2025-09-07T08:58:32.8080062Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_thread_and_block_and_device 2025-09-07T08:58:32.8136090Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_1_var_test 2025-09-07T08:58:32.8191931Z inflating: build/bin/BackoffTest 2025-09-07T08:58:32.8251806Z inflating: build/bin/TCPStoreTest 2025-09-07T08:58:32.8308094Z inflating: 
build/bin/FileStoreTest 2025-09-07T08:58:32.8365178Z inflating: build/bin/HashStoreTest 2025-09-07T08:58:32.8380376Z inflating: build/bin/ProcessGroupMPITest 2025-09-07T08:58:32.8383186Z inflating: build/bin/example_allreduce 2025-09-07T08:58:32.8438577Z inflating: build/bin/Dimname_test 2025-09-07T08:58:32.8499978Z inflating: build/bin/NamedTensor_test 2025-09-07T08:58:32.8568221Z inflating: build/bin/MaybeOwned_test 2025-09-07T08:58:32.8646196Z inflating: build/bin/Dict_test 2025-09-07T08:58:32.8709882Z inflating: build/bin/atest 2025-09-07T08:58:32.8772851Z inflating: build/bin/apply_utils_test 2025-09-07T08:58:32.8840051Z inflating: build/bin/basic 2025-09-07T08:58:32.8902322Z inflating: build/bin/cpu_generator_test 2025-09-07T08:58:32.8960814Z inflating: build/bin/broadcast_test 2025-09-07T08:58:32.9015589Z inflating: build/bin/cpu_allocator_test 2025-09-07T08:58:32.9072683Z inflating: build/bin/cpu_profiling_allocator_test 2025-09-07T08:58:32.9169698Z inflating: build/bin/cpu_rng_test 2025-09-07T08:58:32.9231289Z inflating: build/bin/extension_backend_test 2025-09-07T08:58:32.9286049Z inflating: build/bin/dlconvertor_test 2025-09-07T08:58:32.9344914Z inflating: build/bin/half_test 2025-09-07T08:58:32.9447289Z inflating: build/bin/ivalue_test 2025-09-07T08:58:32.9500776Z inflating: build/bin/lazy_tensor_test 2025-09-07T08:58:32.9558335Z inflating: build/bin/math_kernel_test 2025-09-07T08:58:32.9615910Z inflating: build/bin/memory_format_test 2025-09-07T08:58:32.9672693Z inflating: build/bin/memory_overlapping_test 2025-09-07T08:58:32.9729026Z inflating: build/bin/mobile_memory_cleanup 2025-09-07T08:58:32.9789586Z inflating: build/bin/native_test 2025-09-07T08:58:32.9844431Z inflating: build/bin/operators_test 2025-09-07T08:58:32.9898674Z inflating: build/bin/operator_name_test 2025-09-07T08:58:32.9953939Z inflating: build/bin/packedtensoraccessor_test 2025-09-07T08:58:33.0024570Z inflating: build/bin/pow_test 2025-09-07T08:58:33.0078039Z inflating: 
build/bin/reduce_ops_test 2025-09-07T08:58:33.0133663Z inflating: build/bin/reportMemoryUsage_test 2025-09-07T08:58:33.0194662Z inflating: build/bin/quantized_test 2025-09-07T08:58:33.0255331Z inflating: build/bin/scalar_tensor_test 2025-09-07T08:58:33.0317983Z inflating: build/bin/scalar_test 2025-09-07T08:58:33.0373141Z inflating: build/bin/StorageUtils_test 2025-09-07T08:58:33.0429038Z inflating: build/bin/stride_properties_test 2025-09-07T08:58:33.0513284Z inflating: build/bin/tensor_iterator_test 2025-09-07T08:58:33.0567151Z inflating: build/bin/thread_init_test 2025-09-07T08:58:33.0625152Z inflating: build/bin/test_parallel 2025-09-07T08:58:33.0684132Z inflating: build/bin/type_ptr_test 2025-09-07T08:58:33.0747370Z inflating: build/bin/type_test 2025-09-07T08:58:33.0804033Z inflating: build/bin/undefined_tensor_test 2025-09-07T08:58:33.0856991Z inflating: build/bin/verify_api_visibility 2025-09-07T08:58:33.0930925Z inflating: build/bin/legacy_vmap_test 2025-09-07T08:58:33.0985840Z inflating: build/bin/weakref_test 2025-09-07T08:58:33.1040585Z inflating: build/bin/wrapdim_test 2025-09-07T08:58:33.1095616Z inflating: build/bin/xla_tensor_test 2025-09-07T08:58:33.1159953Z inflating: build/bin/IListRef_test 2025-09-07T08:58:33.1270501Z inflating: build/bin/List_test 2025-09-07T08:58:33.1340544Z inflating: build/bin/KernelFunction_test 2025-09-07T08:58:33.1467748Z inflating: build/bin/kernel_function_legacy_test 2025-09-07T08:58:33.1568002Z inflating: build/bin/kernel_function_test 2025-09-07T08:58:33.1700127Z inflating: build/bin/kernel_lambda_legacy_test 2025-09-07T08:58:33.1808443Z inflating: build/bin/kernel_lambda_test 2025-09-07T08:58:33.1873524Z inflating: build/bin/kernel_stackbased_test 2025-09-07T08:58:33.1973654Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-09-07T08:58:33.2028524Z inflating: build/bin/CppSignature_test 2025-09-07T08:58:33.2087800Z inflating: build/bin/backend_fallback_test 2025-09-07T08:58:33.2140047Z inflating: 
build/bin/op_allowlist_test 2025-09-07T08:58:33.2445158Z inflating: build/bin/op_registration_test 2025-09-07T08:58:33.2514918Z inflating: build/bin/inline_container_test 2025-09-07T08:58:33.2571116Z inflating: build/bin/cuda_apply_test 2025-09-07T08:58:33.2634462Z inflating: build/bin/cuda_atomic_ops_test 2025-09-07T08:58:33.2690623Z inflating: build/bin/cuda_allocator_test 2025-09-07T08:58:33.2751207Z inflating: build/bin/cuda_caching_host_allocator_test 2025-09-07T08:58:33.2825048Z inflating: build/bin/cuda_complex_math_test 2025-09-07T08:58:33.2888075Z inflating: build/bin/cuda_complex_test 2025-09-07T08:58:33.2953952Z inflating: build/bin/cuda_cub_test 2025-09-07T08:58:33.3008034Z inflating: build/bin/cuda_device_test 2025-09-07T08:58:33.3076823Z inflating: build/bin/cuda_distributions_test 2025-09-07T08:58:33.3132449Z inflating: build/bin/cuda_dlconvertor_test 2025-09-07T08:58:33.3185912Z inflating: build/bin/cuda_exchange_device_test 2025-09-07T08:58:33.3246370Z inflating: build/bin/cuda_generator_test 2025-09-07T08:58:33.3299913Z inflating: build/bin/cuda_half_test 2025-09-07T08:58:33.3355692Z inflating: build/bin/cuda_integer_divider_test 2025-09-07T08:58:33.3409150Z inflating: build/bin/cuda_optional_test 2025-09-07T08:58:33.3464947Z inflating: build/bin/cuda_reportMemoryUsage_test 2025-09-07T08:58:33.3520339Z inflating: build/bin/cuda_packedtensoraccessor_test 2025-09-07T08:58:33.3573550Z inflating: build/bin/cuda_allocatorTraceTracker_test 2025-09-07T08:58:33.3638103Z inflating: build/bin/cuda_stream_test 2025-09-07T08:58:33.3694623Z inflating: build/bin/cuda_vectorized_test 2025-09-07T08:58:33.3747673Z inflating: build/bin/cuda_cudnn_test 2025-09-07T08:58:33.4135904Z inflating: build/bin/test_nativert 2025-09-07T08:58:33.4194392Z inflating: build/bin/test_dist_autograd 2025-09-07T08:58:33.4267329Z inflating: build/bin/test_cpp_rpc 2025-09-07T08:58:33.5468255Z inflating: build/bin/test_api 2025-09-07T08:58:33.5470568Z inflating: 
build/bin/parallel_benchmark 2025-09-07T08:58:33.5537948Z inflating: build/bin/ProcessGroupNCCLTest 2025-09-07T08:58:33.5604700Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2025-09-07T08:58:33.6726359Z inflating: build/bin/test_jit 2025-09-07T08:58:33.6795884Z inflating: build/bin/ProcessGroupGlooTest 2025-09-07T08:58:33.6858244Z inflating: build/bin/ProcessGroupGlooAsyncTest 2025-09-07T08:58:33.7212281Z inflating: build/bin/test_lazy 2025-09-07T08:58:33.7216081Z inflating: build/bin/torch_shm_manager 2025-09-07T08:58:33.7216669Z creating: .additional_ci_files/ 2025-09-07T08:58:33.7305950Z inflating: .additional_ci_files/test-times.json 2025-09-07T08:58:33.7645865Z inflating: .additional_ci_files/test-class-times.json 2025-09-07T08:58:33.7902553Z ##[group]Run rm artifacts.zip 2025-09-07T08:58:33.7902879Z rm artifacts.zip 2025-09-07T08:58:33.7915443Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:58:33.7915825Z env: 2025-09-07T08:58:33.7916050Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:33.7916394Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:33.7916885Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:33.7917272Z ##[endgroup] 2025-09-07T08:58:33.9659357Z ##[group]Run df -H 2025-09-07T08:58:33.9659627Z df -H 2025-09-07T08:58:33.9671162Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:58:33.9671560Z env: 2025-09-07T08:58:33.9671772Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:33.9672116Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:33.9672586Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:33.9673181Z ##[endgroup] 2025-09-07T08:58:33.9864554Z Filesystem Size Used Avail Use% Mounted on 2025-09-07T08:58:33.9864922Z overlay 5.3T 527G 4.7T 11% / 2025-09-07T08:58:33.9865246Z tmpfs 68M 0 68M 0% /dev 2025-09-07T08:58:33.9865580Z shm 68M 0 68M 0% /dev/shm 2025-09-07T08:58:33.9865962Z /dev/root 5.3T 527G 4.7T 11% 
/home/charlie/_work 2025-09-07T08:58:33.9866374Z tmpfs 121G 111k 121G 1% /run/docker.sock 2025-09-07T08:58:33.9866927Z tmpfs 603G 13k 603G 1% /proc/driver/nvidia 2025-09-07T08:58:33.9867408Z tmpfs 241G 1.7M 241G 1% /run/.ro3220624284/nvidia-persistenced/socket 2025-09-07T08:58:33.9867866Z tmpfs 603G 0 603G 0% /proc/acpi 2025-09-07T08:58:33.9868224Z tmpfs 603G 0 603G 0% /proc/scsi 2025-09-07T08:58:33.9868570Z tmpfs 603G 0 603G 0% /sys/firmware 2025-09-07T08:58:33.9901836Z Prepare all required actions 2025-09-07T08:58:33.9902394Z Getting action download info 2025-09-07T08:58:34.1402602Z ##[group]Run ./.github/actions/download-td-artifacts 2025-09-07T08:58:34.1403120Z with: 2025-09-07T08:58:34.1403322Z env: 2025-09-07T08:58:34.1403531Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:34.1403859Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:34.1404316Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:34.1404703Z ##[endgroup] 2025-09-07T08:58:34.1848949Z ##[group]Run seemethere/download-artifact-s3@v4 2025-09-07T08:58:34.1849293Z with: 2025-09-07T08:58:34.1849520Z name: td_results 2025-09-07T08:58:34.1849770Z s3-bucket: gha-artifacts 2025-09-07T08:58:34.1850047Z region: us-east-1 2025-09-07T08:58:34.1850268Z env: 2025-09-07T08:58:34.1850486Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:34.1850833Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:34.1851412Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:34.1851829Z ##[endgroup] 2025-09-07T08:58:34.6516712Z (node:11342) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-09-07T08:58:34.6517238Z 2025-09-07T08:58:34.6517432Z Please migrate your code to use AWS SDK for JavaScript (v3). 
2025-09-07T08:58:34.6517989Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-09-07T08:58:34.6518565Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-09-07T08:58:34.7350572Z Found 0 objects with prefix pytorch/pytorch/17525309334/td_results/ 2025-09-07T08:58:34.7356446Z Artifact download has finished successfully 2025-09-07T08:58:34.7667625Z ##[group]Run mkdir -p .additional_ci_files 2025-09-07T08:58:34.7668024Z mkdir -p .additional_ci_files 2025-09-07T08:58:34.7668478Z mv td_results.json .additional_ci_files/td_results.json || true 2025-09-07T08:58:34.7681076Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:58:34.7681489Z env: 2025-09-07T08:58:34.7681701Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:34.7682041Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:34.7682511Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:34.7683089Z ##[endgroup] 2025-09-07T08:58:34.7883357Z mv: cannot stat 'td_results.json': No such file or directory 2025-09-07T08:58:34.8136678Z ##[group]Run .github/scripts/parse_ref.py 2025-09-07T08:58:34.8137065Z .github/scripts/parse_ref.py 2025-09-07T08:58:34.8148797Z shell: /usr/bin/bash -e {0} 2025-09-07T08:58:34.8149093Z env: 2025-09-07T08:58:34.8149304Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:34.8149652Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:34.8150136Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:34.8150542Z ##[endgroup] 2025-09-07T08:58:34.8515090Z Setting output branch=main 2025-09-07T08:58:34.8643597Z Prepare all required actions 2025-09-07T08:58:34.8644189Z Getting action download info 2025-09-07T08:58:35.0074324Z ##[group]Run ./.github/actions/filter-test-configs 2025-09-07T08:58:35.0074730Z with: 2025-09-07T08:58:35.0075148Z github-token: *** 2025-09-07T08:58:35.0080994Z test-matrix: {"include": [{"config": "inductor_huggingface_perf", 
"shard": 1, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 2, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 3, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 4, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 5, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 1, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 2, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 3, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 4, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 5, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 6, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 1, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 2, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 3, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 4, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 5, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 6, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "cachebench", "shard": 1, "num_shards": 2, "runner": "linux.aws.a100"}, {"config": "cachebench", "shard": 2, "num_shards": 2, "runner": "linux.aws.a100"}]} 2025-09-07T08:58:35.0087391Z job-name: cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T08:58:35.0087894Z env: 2025-09-07T08:58:35.0088174Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:35.0088518Z GPU_FLAG: --gpus 
all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:35.0088976Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:35.0089376Z ##[endgroup] 2025-09-07T08:58:35.0631859Z ##[group]Run nick-fields/retry@v3.0.0 2025-09-07T08:58:35.0632175Z with: 2025-09-07T08:58:35.0632371Z shell: bash 2025-09-07T08:58:35.0632605Z timeout_minutes: 10 2025-09-07T08:58:35.0632855Z max_attempts: 5 2025-09-07T08:58:35.0633097Z retry_wait_seconds: 30 2025-09-07T08:58:35.0633916Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-09-07T08:58:35.0634789Z polling_interval_seconds: 1 2025-09-07T08:58:35.0635092Z warning_on_retry: true 2025-09-07T08:58:35.0635362Z continue_on_error: false 2025-09-07T08:58:35.0635610Z env: 2025-09-07T08:58:35.0635827Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:35.0636173Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:35.0636646Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:35.0637208Z GITHUB_TOKEN: *** 2025-09-07T08:58:35.0637438Z ##[endgroup] 2025-09-07T08:58:35.1407269Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-09-07T08:58:35.4445005Z Defaulting to user installation because normal site-packages is not writeable 2025-09-07T08:58:35.9368296Z Collecting requests==2.27.1 2025-09-07T08:58:35.9878486Z Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB) 2025-09-07T08:58:36.3085470Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.1/63.1 KB 166.3 kB/s eta 0:00:00 2025-09-07T08:58:36.6385193Z Collecting pyyaml==6.0.2 2025-09-07T08:58:36.6415208Z Downloading PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (751 kB) 2025-09-07T08:58:37.0005651Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 751.2/751.2 KB 2.1 MB/s eta 0:00:00 2025-09-07T08:58:37.0149106Z Requirement already satisfied: idna<4,>=2.5 in 
/usr/lib/python3/dist-packages (from requests==2.27.1) (3.3) 2025-09-07T08:58:37.0154960Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3/dist-packages (from requests==2.27.1) (1.26.5) 2025-09-07T08:58:37.4929528Z Collecting charset-normalizer~=2.0.0 2025-09-07T08:58:37.4961603Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2025-09-07T08:58:37.7800060Z Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests==2.27.1) (2020.6.20) 2025-09-07T08:58:37.8584677Z Installing collected packages: pyyaml, charset-normalizer, requests 2025-09-07T08:58:38.6219383Z WARNING: The script normalizer is installed in '/home/charlie/.local/bin' which is not on PATH. 2025-09-07T08:58:38.6220267Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-09-07T08:58:39.3562583Z Successfully installed charset-normalizer-2.0.12 pyyaml-6.0.2 requests-2.27.1 2025-09-07T08:58:40.1431879Z Command completed after 1 attempt(s). 
2025-09-07T08:58:40.1712331Z ##[group]Run set -x 2025-09-07T08:58:40.1712639Z set -x 2025-09-07T08:58:40.1712893Z  2025-09-07T08:58:40.1713295Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-09-07T08:58:40.1729413Z # in runner workspace 2025-09-07T08:58:40.1729929Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-09-07T08:58:40.1743670Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:58:40.1744082Z env: 2025-09-07T08:58:40.1744313Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:40.1744657Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:40.1745126Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:40.1745545Z ##[endgroup] 2025-09-07T08:58:40.1910835Z + python3 /home/charlie/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-09-07T08:58:40.2072678Z Setting output branch=main 2025-09-07T08:58:40.2286738Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-09-07T08:58:40.2287218Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-09-07T08:58:40.2287578Z echo "Job name: ${JOB_NAME}" 2025-09-07T08:58:40.2287893Z  2025-09-07T08:58:40.2288289Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-09-07T08:58:40.2288793Z # in runner workspace 2025-09-07T08:58:40.2289223Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-09-07T08:58:40.2289721Z  --workflow "${GITHUB_WORKFLOW}" \ 2025-09-07T08:58:40.2290082Z  --job-name "${JOB_NAME}" \ 2025-09-07T08:58:40.2296079Z  --test-matrix "{"include": [{"config": "inductor_huggingface_perf", "shard": 1, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 2, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 3, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 4, "num_shards": 5, "runner": 
"linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 5, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 1, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 2, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 3, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 4, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 5, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 6, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 1, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 2, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 3, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 4, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 5, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 6, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "cachebench", "shard": 1, "num_shards": 2, "runner": "linux.aws.a100"}, {"config": "cachebench", "shard": 2, "num_shards": 2, "runner": "linux.aws.a100"}]}" \ 2025-09-07T08:58:40.2303290Z  --selected-test-configs "" \ 2025-09-07T08:58:40.2303648Z  --pr-number "${PR_NUMBER}" \ 2025-09-07T08:58:40.2303970Z  --tag "${TAG}" \ 2025-09-07T08:58:40.2304282Z  --event-name "${EVENT_NAME}" \ 2025-09-07T08:58:40.2304623Z  --schedule "${SCHEDULE}" \ 2025-09-07T08:58:40.2304950Z  --branch "${HEAD_BRANCH}" 2025-09-07T08:58:40.2316836Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:58:40.2317220Z env: 2025-09-07T08:58:40.2317452Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:40.2317803Z GPU_FLAG: --gpus all -e 
NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:40.2318277Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:40.2318899Z GITHUB_TOKEN: *** 2025-09-07T08:58:40.2319335Z JOB_NAME: cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T08:58:40.2319853Z PR_NUMBER: 2025-09-07T08:58:40.2320083Z TAG: 2025-09-07T08:58:40.2320307Z EVENT_NAME: schedule 2025-09-07T08:58:40.2320558Z SCHEDULE: 0 7 * * 0 2025-09-07T08:58:40.2320817Z HEAD_BRANCH: main 2025-09-07T08:58:40.2321071Z ##[endgroup] 2025-09-07T08:58:40.2500363Z Workflow: inductor-A100-perf-nightly 2025-09-07T08:58:40.2500942Z Job name: cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T08:58:40.3982943Z Setting output keep-going=True 2025-09-07T08:58:40.3983304Z Setting output ci-verbose-test-logs=False 2025-09-07T08:58:40.3983679Z Setting output ci-test-showlocals=False 2025-09-07T08:58:40.3984036Z Setting output ci-no-test-timeout=False 2025-09-07T08:58:40.3984384Z Setting output ci-no-td=False 2025-09-07T08:58:40.3984691Z Setting output ci-td-distributed=False 2025-09-07T08:58:40.3985027Z Setting output is-unstable=False 2025-09-07T08:58:40.3985349Z Setting output reenabled-issues= 2025-09-07T08:58:40.3991460Z Setting output test-matrix={"include": [{"config": "inductor_huggingface_perf", "shard": 1, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 2, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 3, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 4, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 5, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 1, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 2, "num_shards": 6, "runner": "linux.aws.a100"}, 
{"config": "inductor_timm_perf", "shard": 3, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 4, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 5, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 6, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 1, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 2, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 3, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 4, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 5, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 6, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "cachebench", "shard": 1, "num_shards": 2, "runner": "linux.aws.a100"}, {"config": "cachebench", "shard": 2, "num_shards": 2, "runner": "linux.aws.a100"}]} 2025-09-07T08:58:40.3997720Z Setting output is-test-matrix-empty=False 2025-09-07T08:58:40.4214675Z ##[group]Run echo "Filtered matrix:" 2025-09-07T08:58:40.4215033Z echo "Filtered matrix:" 2025-09-07T08:58:40.4220925Z echo "{"include": [{"config": "inductor_huggingface_perf", "shard": 1, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 2, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 3, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 4, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_huggingface_perf", "shard": 5, "num_shards": 5, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 1, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 2, "num_shards": 6, 
"runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 3, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 4, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 5, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_timm_perf", "shard": 6, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 1, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 2, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 3, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 4, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 5, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "inductor_torchbench_perf", "shard": 6, "num_shards": 6, "runner": "linux.aws.a100"}, {"config": "cachebench", "shard": 1, "num_shards": 2, "runner": "linux.aws.a100"}, {"config": "cachebench", "shard": 2, "num_shards": 2, "runner": "linux.aws.a100"}]}" 2025-09-07T08:58:40.4226909Z  2025-09-07T08:58:40.4227122Z echo 2025-09-07T08:58:40.4227408Z echo "Is the current job unstable? False" 2025-09-07T08:58:40.4227757Z  2025-09-07T08:58:40.4227958Z echo 2025-09-07T08:58:40.4228221Z echo "Is keep-going label set? True" 2025-09-07T08:58:40.4228569Z  2025-09-07T08:58:40.4228764Z echo 2025-09-07T08:58:40.4229006Z echo "Reenabled issues? 
" 2025-09-07T08:58:40.4240813Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:58:40.4241223Z env: 2025-09-07T08:58:40.4241438Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:40.4241783Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:40.4242251Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:40.4242653Z ##[endgroup] 2025-09-07T08:58:40.4461787Z Filtered matrix: 2025-09-07T08:58:40.4468656Z {include: [{config: inductor_huggingface_perf, shard: 1, num_shards: 5, runner: linux.aws.a100}, {config: inductor_huggingface_perf, shard: 2, num_shards: 5, runner: linux.aws.a100}, {config: inductor_huggingface_perf, shard: 3, num_shards: 5, runner: linux.aws.a100}, {config: inductor_huggingface_perf, shard: 4, num_shards: 5, runner: linux.aws.a100}, {config: inductor_huggingface_perf, shard: 5, num_shards: 5, runner: linux.aws.a100}, {config: inductor_timm_perf, shard: 1, num_shards: 6, runner: linux.aws.a100}, {config: inductor_timm_perf, shard: 2, num_shards: 6, runner: linux.aws.a100}, {config: inductor_timm_perf, shard: 3, num_shards: 6, runner: linux.aws.a100}, {config: inductor_timm_perf, shard: 4, num_shards: 6, runner: linux.aws.a100}, {config: inductor_timm_perf, shard: 5, num_shards: 6, runner: linux.aws.a100}, {config: inductor_timm_perf, shard: 6, num_shards: 6, runner: linux.aws.a100}, {config: inductor_torchbench_perf, shard: 1, num_shards: 6, runner: linux.aws.a100}, {config: inductor_torchbench_perf, shard: 2, num_shards: 6, runner: linux.aws.a100}, {config: inductor_torchbench_perf, shard: 3, num_shards: 6, runner: linux.aws.a100}, {config: inductor_torchbench_perf, shard: 4, num_shards: 6, runner: linux.aws.a100}, {config: inductor_torchbench_perf, shard: 5, num_shards: 6, runner: linux.aws.a100}, {config: inductor_torchbench_perf, shard: 6, num_shards: 6, runner: linux.aws.a100}, {config: cachebench, shard: 1, num_shards: 2, runner: linux.aws.a100}, {config: cachebench, shard: 2, 
num_shards: 2, runner: linux.aws.a100}]} 2025-09-07T08:58:40.4474558Z 2025-09-07T08:58:40.4474667Z Is the current job unstable? False 2025-09-07T08:58:40.4474894Z 2025-09-07T08:58:40.4475007Z Is keep-going label set? True 2025-09-07T08:58:40.4475201Z 2025-09-07T08:58:40.4475306Z Reenabled issues? 2025-09-07T08:58:40.4669587Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-09-07T08:58:40.4670128Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-09-07T08:58:40.4681656Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:58:40.4682040Z env: 2025-09-07T08:58:40.4682258Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:40.4682598Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:40.4683276Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:40.4683673Z JOB_TIMEOUT: 1440 2025-09-07T08:58:40.4683916Z ##[endgroup] 2025-09-07T08:58:40.5202624Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T08:58:40.5203433Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T08:58:40.5203913Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T08:58:40.5215405Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:58:40.5215800Z env: 2025-09-07T08:58:40.5216027Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:40.5216372Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:58:40.5216829Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:40.5217231Z ##[endgroup] 2025-09-07T08:58:40.5526566Z ##[group]Run set -x 2025-09-07T08:58:40.5526924Z set -x 2025-09-07T08:58:40.5527155Z  2025-09-07T08:58:40.5527403Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-09-07T08:58:40.5527815Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-09-07T08:58:40.5528226Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-09-07T08:58:40.5528604Z  TEST_COMMAND=.ci/onnx/test.sh 
2025-09-07T08:58:40.5528906Z else 2025-09-07T08:58:40.5529177Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-09-07T08:58:40.5529495Z fi 2025-09-07T08:58:40.5529707Z  2025-09-07T08:58:40.5529968Z # Leaving 1GB for the runner and other things 2025-09-07T08:58:40.5530576Z TOTAL_AVAILABLE_MEMORY_IN_GB=$(awk '/MemTotal/ { printf "%.3f \n", $2/1024/1024 - 1 }' /proc/meminfo) 2025-09-07T08:58:40.5531487Z # https://docs.docker.com/engine/containers/resource_constraints/#--memory-swap-details, the 3GB swap 2025-09-07T08:58:40.5532222Z # comes from https://github.com/pytorch/test-infra/pull/6058 2025-09-07T08:58:40.5532758Z TOTAL_MEMORY_WITH_SWAP=$(("${TOTAL_AVAILABLE_MEMORY_IN_GB%.*}" + 3)) 2025-09-07T08:58:40.5533195Z  2025-09-07T08:58:40.5533467Z if [[ ${BUILD_ENVIRONMENT} == *"s390x"* ]]; then 2025-09-07T08:58:40.5533825Z  SHM_OPTS= 2025-09-07T08:58:40.5534074Z  JENKINS_USER= 2025-09-07T08:58:40.5534436Z  # ensure that docker container cleanly exits in 12 hours 2025-09-07T08:58:40.5535108Z  # if for some reason cleanup action doesn't stop container 2025-09-07T08:58:40.5535531Z  # when job is cancelled 2025-09-07T08:58:40.5535854Z  DOCKER_SHELL_CMD="sleep 12h" 2025-09-07T08:58:40.5536151Z else 2025-09-07T08:58:40.5536412Z  SHM_OPTS="--shm-size=${SHM_SIZE}" 2025-09-07T08:58:40.5536761Z  JENKINS_USER="--user jenkins" 2025-09-07T08:58:40.5537086Z  DOCKER_SHELL_CMD= 2025-09-07T08:58:40.5537440Z fi 2025-09-07T08:58:40.5537657Z  2025-09-07T08:58:40.5538006Z # detached container should get cleaned up by teardown_ec2_linux 2025-09-07T08:58:40.5538562Z # TODO: Stop building test binaries as part of the build phase 2025-09-07T08:58:40.5539184Z # Used for GPU_FLAG, SHM_OPTS, JENKINS_USER and DOCKER_SHELL_CMD since that doesn't play nice 2025-09-07T08:58:40.5539748Z # shellcheck disable=SC2086,SC2090 2025-09-07T08:58:40.5540098Z container_name=$(docker run \ 2025-09-07T08:58:40.5540421Z  ${GPU_FLAG:-} \ 2025-09-07T08:58:40.5540736Z  ${SCCACHE_SERVER_PORT_DOCKER_FLAG:-} \ 
2025-09-07T08:58:40.5541086Z  -e BUILD_ENVIRONMENT \ 2025-09-07T08:58:40.5541397Z  -e PR_NUMBER \ 2025-09-07T08:58:40.5541686Z  -e GITHUB_ACTIONS \ 2025-09-07T08:58:40.5541992Z  -e GITHUB_REPOSITORY \ 2025-09-07T08:58:40.5542293Z  -e GITHUB_WORKFLOW \ 2025-09-07T08:58:40.5542590Z  -e GITHUB_JOB \ 2025-09-07T08:58:40.5542868Z  -e GITHUB_RUN_ID \ 2025-09-07T08:58:40.5543158Z  -e GITHUB_RUN_NUMBER \ 2025-09-07T08:58:40.5543455Z  -e GITHUB_RUN_ATTEMPT \ 2025-09-07T08:58:40.5543759Z  -e JOB_ID \ 2025-09-07T08:58:40.5544018Z  -e JOB_NAME \ 2025-09-07T08:58:40.5544285Z  -e BASE_SHA \ 2025-09-07T08:58:40.5544539Z  -e BRANCH \ 2025-09-07T08:58:40.5544795Z  -e SHA1 \ 2025-09-07T08:58:40.5545062Z  -e AWS_DEFAULT_REGION \ 2025-09-07T08:58:40.5545369Z  -e IN_WHEEL_TEST \ 2025-09-07T08:58:40.5545644Z  -e SHARD_NUMBER \ 2025-09-07T08:58:40.5545928Z  -e TEST_CONFIG \ 2025-09-07T08:58:40.5546213Z  -e NUM_TEST_SHARDS \ 2025-09-07T08:58:40.5546514Z  -e REENABLED_ISSUES \ 2025-09-07T08:58:40.5546815Z  -e CONTINUE_THROUGH_ERROR \ 2025-09-07T08:58:40.5547308Z  -e VERBOSE_TEST_LOGS \ 2025-09-07T08:58:40.5547627Z  -e TEST_SHOWLOCALS \ 2025-09-07T08:58:40.5547928Z  -e NO_TEST_TIMEOUT \ 2025-09-07T08:58:40.5548202Z  -e NO_TD \ 2025-09-07T08:58:40.5548469Z  -e TD_DISTRIBUTED \ 2025-09-07T08:58:40.5548761Z  -e PR_LABELS \ 2025-09-07T08:58:40.5549068Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-09-07T08:58:40.5549401Z  -e SCCACHE_BUCKET \ 2025-09-07T08:58:40.5549690Z  -e SCCACHE_REGION \ 2025-09-07T08:58:40.5549982Z  -e XLA_CUDA \ 2025-09-07T08:58:40.5550276Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2025-09-07T08:58:40.5550653Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-09-07T08:58:40.5551027Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-09-07T08:58:40.5551409Z  -e SKIP_SCCACHE_INITIALIZATION=1 \ 2025-09-07T08:58:40.5551763Z  -e HUGGING_FACE_HUB_TOKEN \ 2025-09-07T08:58:40.5552106Z  -e VLLM_TEST_HUGGING_FACE_TOKEN \ 2025-09-07T08:58:40.5552445Z  -e SCRIBE_GRAPHQL_ACCESS_TOKEN \ 
2025-09-07T08:58:40.5552778Z  -e DASHBOARD_TAG \ 2025-09-07T08:58:40.5553073Z  -e ARTIFACTS_FILE_SUFFIX \ 2025-09-07T08:58:40.5553444Z  --memory="${TOTAL_AVAILABLE_MEMORY_IN_GB%.*}g" \ 2025-09-07T08:58:40.5553864Z  --memory-swap="${TOTAL_MEMORY_WITH_SWAP}g" \ 2025-09-07T08:58:40.5554294Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2025-09-07T08:58:40.5554702Z  --security-opt seccomp=unconfined \ 2025-09-07T08:58:40.5555158Z  --cap-add=SYS_PTRACE \ 2025-09-07T08:58:40.5555462Z  --ipc=host \ 2025-09-07T08:58:40.5555715Z  ${SHM_OPTS} \ 2025-09-07T08:58:40.5555978Z  --tty \ 2025-09-07T08:58:40.5556222Z  --detach \ 2025-09-07T08:58:40.5556495Z  --name="${container_name}" \ 2025-09-07T08:58:40.5556806Z  ${JENKINS_USER} \ 2025-09-07T08:58:40.5557170Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-09-07T08:58:40.5557577Z  -w /var/lib/jenkins/workspace \ 2025-09-07T08:58:40.5557906Z  "${DOCKER_IMAGE}" \ 2025-09-07T08:58:40.5558181Z  ${DOCKER_SHELL_CMD} 2025-09-07T08:58:40.5558454Z ) 2025-09-07T08:58:40.5558752Z # Propagate download.pytorch.org IP to container 2025-09-07T08:58:40.5559448Z grep download.pytorch.org /etc/hosts | docker exec -i "${container_name}" sudo bash -c "/bin/cat >> /etc/hosts" 2025-09-07T08:58:40.5560195Z echo "DOCKER_CONTAINER_ID=${container_name}" >> "${GITHUB_ENV}" 2025-09-07T08:58:40.5560609Z  2025-09-07T08:58:40.5560883Z if [[ ${BUILD_ENVIRONMENT} == *"s390x"* ]]; then 2025-09-07T08:58:40.5561488Z  docker exec -t "${container_name}" sh -c "python3 -m pip install -r .ci/docker/requirements-ci.txt" 2025-09-07T08:58:40.5562032Z fi 2025-09-07T08:58:40.5562234Z  2025-09-07T08:58:40.5562747Z docker exec -t "${container_name}" sh -c "python3 -m pip install $(echo dist/*.whl)[opt-einsum] && ${TEST_COMMAND}" 2025-09-07T08:58:40.5574596Z shell: /usr/bin/bash -e {0} 2025-09-07T08:58:40.5574889Z env: 2025-09-07T08:58:40.5575110Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:58:40.5575438Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 
2025-09-07T08:58:40.5575909Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T08:58:40.5576406Z BUILD_ENVIRONMENT: linux-jammy-cuda12.8-py3.10-gcc9-sm80 2025-09-07T08:58:40.5576805Z PR_NUMBER: 2025-09-07T08:58:40.5577041Z GITHUB_REPOSITORY: pytorch/pytorch 2025-09-07T08:58:40.5577497Z GITHUB_WORKFLOW: inductor-A100-perf-nightly 2025-09-07T08:58:40.5577837Z GITHUB_JOB: test 2025-09-07T08:58:40.5578084Z GITHUB_RUN_ID: 17525309334 2025-09-07T08:58:40.5578348Z GITHUB_RUN_NUMBER: 5418 2025-09-07T08:58:40.5578620Z GITHUB_RUN_ATTEMPT: 1 2025-09-07T08:58:40.5578872Z JOB_ID: 49775768433 2025-09-07T08:58:40.5579497Z JOB_NAME: cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T08:58:40.5580013Z BRANCH: main 2025-09-07T08:58:40.5580264Z SHA1: 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:58:40.5580647Z BASE_SHA: 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:58:40.5581019Z TEST_CONFIG: inductor_torchbench_perf 2025-09-07T08:58:40.5581336Z SHARD_NUMBER: 6 2025-09-07T08:58:40.5581558Z NUM_TEST_SHARDS: 6 2025-09-07T08:58:40.5581802Z REENABLED_ISSUES: 2025-09-07T08:58:40.5582064Z CONTINUE_THROUGH_ERROR: True 2025-09-07T08:58:40.5582353Z VERBOSE_TEST_LOGS: False 2025-09-07T08:58:40.5582615Z TEST_SHOWLOCALS: False 2025-09-07T08:58:40.5582880Z NO_TEST_TIMEOUT: False 2025-09-07T08:58:40.5583137Z NO_TD: False 2025-09-07T08:58:40.5583371Z TD_DISTRIBUTED: False 2025-09-07T08:58:40.5583679Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2025-09-07T08:58:40.5584054Z SCCACHE_REGION: us-east-1 2025-09-07T08:58:40.5584327Z SHM_SIZE: 2g 2025-09-07T08:58:40.5585255Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T08:58:40.5586230Z XLA_CUDA: 2025-09-07T08:58:40.5586595Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2025-09-07T08:58:40.5587062Z 
PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 0 2025-09-07T08:58:40.5587396Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-09-07T08:58:40.5588723Z DASHBOARD_TAG: training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true 2025-09-07T08:58:40.5590389Z VLLM_TEST_HUGGING_FACE_TOKEN: *** 2025-09-07T08:58:40.5590802Z HUGGING_FACE_HUB_TOKEN: *** 2025-09-07T08:58:40.5591234Z SCRIBE_GRAPHQL_ACCESS_TOKEN: *** 2025-09-07T08:58:40.5591733Z ARTIFACTS_FILE_SUFFIX: test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433 2025-09-07T08:58:40.5592243Z ##[endgroup] 2025-09-07T08:58:40.5748789Z + [[ inductor_torchbench_perf == \m\u\l\t\i\g\p\u ]] 2025-09-07T08:58:40.5749224Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *onnx* ]] 2025-09-07T08:58:40.5749606Z + TEST_COMMAND=.ci/pytorch/test.sh 2025-09-07T08:58:40.5752095Z ++ awk '/MemTotal/ { printf "%.3f \n", $2/1024/1024 - 1 }' /proc/meminfo 2025-09-07T08:58:40.5760264Z + TOTAL_AVAILABLE_MEMORY_IN_GB='1120.797 ' 2025-09-07T08:58:40.5760626Z + TOTAL_MEMORY_WITH_SWAP=1123 2025-09-07T08:58:40.5761004Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *\s\3\9\0\x* ]] 2025-09-07T08:58:40.5761408Z + SHM_OPTS=--shm-size=2g 2025-09-07T08:58:40.5761688Z + JENKINS_USER='--user jenkins' 2025-09-07T08:58:40.5761959Z + DOCKER_SHELL_CMD= 2025-09-07T08:58:40.5769001Z +++ nproc --ignore=2 2025-09-07T08:58:40.5782717Z ++ docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all -e SCCACHE_SERVER_PORT=5229 -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e 
TD_DISTRIBUTED -e PR_LABELS -e MAX_JOBS=10 -e SCCACHE_BUCKET -e SCCACHE_REGION -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e SKIP_SCCACHE_INITIALIZATION=1 -e HUGGING_FACE_HUB_TOKEN -e VLLM_TEST_HUGGING_FACE_TOKEN -e SCRIBE_GRAPHQL_ACCESS_TOKEN -e DASHBOARD_TAG -e ARTIFACTS_FILE_SUFFIX --memory=1120g --memory-swap=1123g --env-file=/tmp/github_env_17525309334 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g --tty --detach --name= --user jenkins -v /home/charlie/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T09:11:32.9436773Z + container_name=3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T09:11:32.9438952Z + grep download.pytorch.org /etc/hosts 2025-09-07T09:11:32.9440714Z + docker exec -i 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 sudo bash -c '/bin/cat >> /etc/hosts' 2025-09-07T09:11:33.0224428Z + echo DOCKER_CONTAINER_ID=3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T09:11:33.0228914Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *\s\3\9\0\x* ]] 2025-09-07T09:11:33.0232993Z ++ echo dist/torch-2.9.0a0+git93fb23d-cp310-cp310-linux_x86_64.whl 2025-09-07T09:11:33.0235353Z + docker exec -t 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 sh -c 'python3 -m pip install dist/torch-2.9.0a0+git93fb23d-cp310-cp310-linux_x86_64.whl[opt-einsum] && .ci/pytorch/test.sh' 2025-09-07T09:11:33.4255132Z Processing ./dist/torch-2.9.0a0+git93fb23d-cp310-cp310-linux_x86_64.whl (from torch==2.9.0a0+git93fb23d) 2025-09-07T09:11:34.2546662Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from 
torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (3.19.1) 2025-09-07T09:11:34.2550305Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (4.15.0) 2025-09-07T09:11:34.2554871Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (1.13.3) 2025-09-07T09:11:34.2559344Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (2.8.8) 2025-09-07T09:11:34.2563167Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (3.1.6) 2025-09-07T09:11:34.2567773Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (2025.3.0) 2025-09-07T09:11:34.2581337Z Requirement already satisfied: opt-einsum>=3.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (3.3.0) 2025-09-07T09:11:34.2941731Z Requirement already satisfied: numpy>=1.7 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from opt-einsum>=3.3->torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (1.22.4) 2025-09-07T09:11:34.2960862Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (1.3.0) 2025-09-07T09:11:34.2996866Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (3.0.2) 2025-09-07T09:11:35.2090746Z Installing collected packages: torch 2025-09-07T09:11:46.4189871Z 
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. 2025-09-07T09:11:46.4190868Z dall-e 0.1 requires torchvision, which is not installed. 2025-09-07T09:11:46.4191338Z effdet 0.4.1 requires torchvision, which is not installed. 2025-09-07T09:11:46.4191877Z python-doctr 1.0.0 requires torchvision>=0.15.0, which is not installed. 2025-09-07T09:11:46.4192516Z pytorch-labs-segment-anything-fast 0.2 requires torchao, which is not installed. 2025-09-07T09:11:46.4193316Z pytorch-labs-segment-anything-fast 0.2 requires torchvision>=0.17.0.dev20231026, which is not installed. 2025-09-07T09:11:46.4194100Z timm 1.0.14 requires torchvision, which is not installed. 2025-09-07T09:11:46.4194979Z Successfully installed torch-2.9.0a0+git93fb23d 2025-09-07T09:11:46.5198343Z + export TERM=vt100 2025-09-07T09:11:46.5198619Z + TERM=vt100 2025-09-07T09:11:46.5200969Z ++ dirname .ci/pytorch/test.sh 2025-09-07T09:11:46.5209550Z + source .ci/pytorch/common.sh 2025-09-07T09:11:46.5213036Z +++ dirname .ci/pytorch/common.sh 2025-09-07T09:11:46.5220311Z ++ source .ci/pytorch/common_utils.sh 2025-09-07T09:11:46.5221402Z +++ declare -f -t trap_add 2025-09-07T09:11:46.5226335Z ++ set -ex -o pipefail 2025-09-07T09:11:46.5226767Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *rocm* ]] 2025-09-07T09:11:46.5227146Z ++ BUILD_TEST_LIBTORCH=0 2025-09-07T09:11:46.5229454Z ++ dirname .ci/pytorch/test.sh 2025-09-07T09:11:46.5236134Z + source .ci/pytorch/common-build.sh 2025-09-07T09:11:46.5237590Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 != *win-* ]] 2025-09-07T09:11:46.5244249Z ++++ dirname .ci/pytorch/common-build.sh 2025-09-07T09:11:46.5251306Z +++ cd .ci/pytorch 2025-09-07T09:11:46.5251805Z +++ pwd -P 2025-09-07T09:11:46.5253543Z ++ script_dir=/var/lib/jenkins/workspace/.ci/pytorch 2025-09-07T09:11:46.5254015Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *-pch* ]] 
2025-09-07T09:11:46.5254369Z ++ which sccache 2025-09-07T09:11:46.5265549Z ++ [[ -z ossci-compiler-cache-circleci-v2 ]] 2025-09-07T09:11:46.5265933Z ++ sccache --stop-server 2025-09-07T09:11:46.5289020Z ++ true 2025-09-07T09:11:46.5289262Z ++ rm -f /var/lib/jenkins/sccache_error.log 2025-09-07T09:11:46.5297930Z ++ trap_add sccache_epilogue EXIT 2025-09-07T09:11:46.5298571Z ++ trap_add_cmd=sccache_epilogue 2025-09-07T09:11:46.5298856Z ++ shift 2025-09-07T09:11:46.5299084Z ++ for trap_add_name in "$@" 2025-09-07T09:11:46.5304507Z ++++ trap -p EXIT 2025-09-07T09:11:46.5306959Z +++ eval 'extract_trap_cmd ' 2025-09-07T09:11:46.5307243Z ++++ extract_trap_cmd 2025-09-07T09:11:46.5307500Z ++++ printf '%s\n' '' 2025-09-07T09:11:46.5307752Z +++ printf '%s\n' sccache_epilogue 2025-09-07T09:11:46.5309127Z ++ trap -- ' 2025-09-07T09:11:46.5309378Z sccache_epilogue' EXIT 2025-09-07T09:11:46.5309642Z ++ [[ -n 1 ]] 2025-09-07T09:11:46.5310029Z ++ echo 'Skipping sccache server initialization, setting environment variables' 2025-09-07T09:11:46.5310649Z Skipping sccache server initialization, setting environment variables 2025-09-07T09:11:46.5311112Z ++ export SCCACHE_IDLE_TIMEOUT=0 2025-09-07T09:11:46.5311418Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-09-07T09:11:46.5311770Z ++ export SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-09-07T09:11:46.5312230Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-09-07T09:11:46.5312664Z ++ export RUST_LOG=sccache::server=error 2025-09-07T09:11:46.5313000Z ++ RUST_LOG=sccache::server=error 2025-09-07T09:11:46.5313301Z ++ sccache --zero-stats 2025-09-07T09:11:46.6245424Z Statistics zeroed. 
2025-09-07T09:11:46.6249089Z ++ which ccache 2025-09-07T09:11:46.6261865Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 != *rocm* ]] 2025-09-07T09:11:46.6262317Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 != *s390x* ]] 2025-09-07T09:11:46.6262739Z + [[ -d /var/lib/jenkins/workspace ]] 2025-09-07T09:11:46.6264756Z ++ stat -c %u /var/lib/jenkins/workspace 2025-09-07T09:11:46.6274870Z + WORKSPACE_ORIGINAL_OWNER_ID=1000 2025-09-07T09:11:46.6275214Z + trap_add cleanup_workspace EXIT 2025-09-07T09:11:46.6275545Z + trap_add_cmd=cleanup_workspace 2025-09-07T09:11:46.6275837Z + shift 2025-09-07T09:11:46.6276071Z + for trap_add_name in "$@" 2025-09-07T09:11:46.6282393Z +++ trap -p EXIT 2025-09-07T09:11:46.6284883Z ++ eval 'extract_trap_cmd trap -- '\'' 2025-09-07T09:11:46.6285229Z sccache_epilogue'\'' EXIT' 2025-09-07T09:11:46.6285521Z +++ extract_trap_cmd trap -- ' 2025-09-07T09:11:46.6285813Z sccache_epilogue' EXIT 2025-09-07T09:11:46.6286055Z +++ printf '%s\n' ' 2025-09-07T09:11:46.6286293Z sccache_epilogue' 2025-09-07T09:11:46.6286554Z ++ printf '%s\n' cleanup_workspace 2025-09-07T09:11:46.6287425Z + trap -- ' 2025-09-07T09:11:46.6288036Z sccache_epilogue 2025-09-07T09:11:46.6288442Z cleanup_workspace' EXIT 2025-09-07T09:11:46.6289332Z + sudo chown -R jenkins /var/lib/jenkins/workspace 2025-09-07T09:12:04.8469088Z + git config --global --add safe.directory /var/lib/jenkins/workspace 2025-09-07T09:12:04.8486944Z + echo 'Environment variables:' 2025-09-07T09:12:04.8487270Z Environment variables: 2025-09-07T09:12:04.8487510Z + env 2025-09-07T09:12:04.8496062Z GITHUB_WORKSPACE=/home/charlie/_work/pytorch/pytorch 2025-09-07T09:12:04.8496494Z CONTINUE_THROUGH_ERROR=True 2025-09-07T09:12:04.8496865Z BUILD_ENVIRONMENT=linux-jammy-cuda12.8-py3.10-gcc9-sm80 2025-09-07T09:12:04.8499849Z VLLM_TEST_HUGGING_FACE_TOKEN=*** 2025-09-07T09:12:04.8500157Z HOSTNAME=3a8cbe934959 2025-09-07T09:12:04.8500689Z 
GITHUB_PATH=/home/charlie/_work/_temp/_runner_file_commands/add_path_d23aa7da-8d94-4532-be5c-69937ca2d4a1 2025-09-07T09:12:04.8501271Z GITHUB_ACTION=__run_2 2025-09-07T09:12:04.8501542Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-09-07T09:12:04.8501838Z GITHUB_RUN_NUMBER=5418 2025-09-07T09:12:04.8502118Z TEST_CONFIG=inductor_torchbench_perf 2025-09-07T09:12:04.8502464Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-09-07T09:12:04.8502798Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2025-09-07T09:12:04.8503111Z SCCACHE_IDLE_TIMEOUT=0 2025-09-07T09:12:04.8503516Z SCRIBE_GRAPHQL_ACCESS_TOKEN=*** 2025-09-07T09:12:04.8503829Z GITHUB_TRIGGERING_ACTOR=desertfire 2025-09-07T09:12:04.8504134Z GITHUB_REF_TYPE=branch 2025-09-07T09:12:04.8504441Z BASE_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T09:12:04.8504802Z XLA_CUDA= 2025-09-07T09:12:04.8505036Z NCCL_LIB_DIR=/usr/local/cuda/lib64/ 2025-09-07T09:12:04.8505797Z HUGGING_FACE_HUB_TOKEN=*** 2025-09-07T09:12:04.8506215Z *** 2025-09-07T09:12:04.8506426Z GITHUB_REPOSITORY_ID=65600975 2025-09-07T09:12:04.8506718Z GITHUB_ACTIONS=true 2025-09-07T09:12:04.8506977Z NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T09:12:04.8507328Z SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-09-07T09:12:04.8507717Z SHA1=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T09:12:04.8508154Z GITHUB_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T09:12:04.8508794Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/inductor-perf-test-nightly.yml@refs/heads/main 2025-09-07T09:12:04.8509389Z UCC_HOME=/usr 2025-09-07T09:12:04.8509625Z VERBOSE_TEST_LOGS=False 2025-09-07T09:12:04.8509894Z GITHUB_REF=refs/heads/main 2025-09-07T09:12:04.8510152Z SHARD_NUMBER=6 2025-09-07T09:12:04.8510394Z GITHUB_REF_PROTECTED=true 2025-09-07T09:12:04.8510664Z HOME=/var/lib/jenkins 2025-09-07T09:12:04.8510927Z SCCACHE_SERVER_PORT=5229 2025-09-07T09:12:04.8511210Z GITHUB_API_URL=https://api.github.com 2025-09-07T09:12:04.8511553Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 
2025-09-07T09:12:04.8511904Z UCX_COMMIT=7836b165abdbe468a2f607e7254011c07d788152 2025-09-07T09:12:04.8512253Z USE_SYSTEM_NCCL=1 2025-09-07T09:12:04.8512474Z NUM_TEST_SHARDS=6 2025-09-07T09:12:04.8512703Z UCX_HOME=/usr 2025-09-07T09:12:04.8513208Z GITHUB_STATE=/home/charlie/_work/_temp/_runner_file_commands/save_state_d23aa7da-8d94-4532-be5c-69937ca2d4a1 2025-09-07T09:12:04.8513996Z JOB_NAME=cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T09:12:04.8514744Z GITHUB_ENV=/home/charlie/_work/_temp/_runner_file_commands/set_env_d23aa7da-8d94-4532-be5c-69937ca2d4a1 2025-09-07T09:12:04.8515438Z GITHUB_EVENT_PATH=/home/charlie/_work/_temp/_github_workflow/event.json 2025-09-07T09:12:04.8515892Z GITHUB_EVENT_NAME=schedule 2025-09-07T09:12:04.8517194Z DASHBOARD_TAG=training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true 2025-09-07T09:12:04.8518535Z GITHUB_RUN_ID=17525309334 2025-09-07T09:12:04.8518796Z INSTALLED_OPENBLAS= 2025-09-07T09:12:04.8519363Z GITHUB_STEP_SUMMARY=/home/charlie/_work/_temp/_runner_file_commands/step_summary_d23aa7da-8d94-4532-be5c-69937ca2d4a1 2025-09-07T09:12:04.8519993Z GITHUB_ACTOR=desertfire 2025-09-07T09:12:04.8520251Z PR_NUMBER= 2025-09-07T09:12:04.8520458Z DESIRED_CUDA=12.8.1 2025-09-07T09:12:04.8520958Z GITHUB_RUN_ATTEMPT=1 2025-09-07T09:12:04.8521228Z ANACONDA_PYTHON_VERSION=3.10 2025-09-07T09:12:04.8521575Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-09-07T09:12:04.8521922Z TERM=vt100 2025-09-07T09:12:04.8522144Z INSTALLED_VISION=yes 2025-09-07T09:12:04.8522391Z BRANCH=main 2025-09-07T09:12:04.8522625Z SCCACHE_REGION=us-east-1 2025-09-07T09:12:04.8523182Z OPENSSL_ROOT_DIR=/opt/openssl 2025-09-07T09:12:04.8523484Z CUDA_PATH=/usr/local/cuda 2025-09-07T09:12:04.8523950Z 
GITHUB_ACTION_PATH=/home/charlie/_work/pytorch/pytorch/./.github/actions/setup-linux 2025-09-07T09:12:04.8524486Z GITHUB_SERVER_URL=https://github.com 2025-09-07T09:12:04.8524838Z UCC_COMMIT=430e241bf5d38cbc73fc7a6b89155397232e3f96 2025-09-07T09:12:04.8525191Z REENABLED_ISSUES= 2025-09-07T09:12:04.8525419Z DOCS= 2025-09-07T09:12:04.8525623Z SHLVL=1 2025-09-07T09:12:04.8525815Z MAX_JOBS=10 2025-09-07T09:12:04.8526036Z GITHUB_ACTOR_ID=3659962 2025-09-07T09:12:04.8526380Z GITHUB_WORKFLOW_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T09:12:04.8526765Z GITHUB_REF_NAME=main 2025-09-07T09:12:04.8527154Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2025-09-07T09:12:04.8527594Z GITHUB_JOB=test 2025-09-07T09:12:04.8527831Z NO_TEST_TIMEOUT=False 2025-09-07T09:12:04.8528073Z TD_DISTRIBUTED=False 2025-09-07T09:12:04.8528350Z GITHUB_REPOSITORY=pytorch/pytorch 2025-09-07T09:12:04.8528658Z GITHUB_RETENTION_DAYS=90 2025-09-07T09:12:04.8528919Z OPENSSL_DIR=/opt/openssl 2025-09-07T09:12:04.8529196Z GITHUB_ACTION_REPOSITORY= 2025-09-07T09:12:04.8530193Z PATH=/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-09-07T09:12:04.8531037Z GITHUB_BASE_REF= 2025-09-07T09:12:04.8531272Z INSTALLED_ACL= 2025-09-07T09:12:04.8531679Z ARTIFACTS_FILE_SUFFIX=test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433 2025-09-07T09:12:04.8532166Z CI=true 2025-09-07T09:12:04.8532395Z GITHUB_REPOSITORY_OWNER=pytorch 2025-09-07T09:12:04.8532736Z RUST_LOG=sccache::server=error 2025-09-07T09:12:04.8533005Z JOB_ID=49775768433 2025-09-07T09:12:04.8533238Z GITHUB_HEAD_REF= 2025-09-07T09:12:04.8533476Z GITHUB_ACTION_REF= 2025-09-07T09:12:04.8533759Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2025-09-07T09:12:04.8534120Z TEST_SHOWLOCALS=False 2025-09-07T09:12:04.8534417Z GITHUB_WORKFLOW=inductor-A100-perf-nightly 2025-09-07T09:12:04.8534763Z 
DEBIAN_FRONTEND=noninteractive 2025-09-07T09:12:04.8535330Z GITHUB_OUTPUT=/home/charlie/_work/_temp/_runner_file_commands/set_output_d23aa7da-8d94-4532-be5c-69937ca2d4a1 2025-09-07T09:12:04.8535913Z NO_TD=False 2025-09-07T09:12:04.8536152Z SKIP_SCCACHE_INITIALIZATION=1 2025-09-07T09:12:04.8536471Z NCCL_INCLUDE_DIR=/usr/local/cuda/include/ 2025-09-07T09:12:04.8536790Z _=/usr/bin/env 2025-09-07T09:12:04.8537105Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-09-07T09:12:04.8789036Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2025-09-07T09:12:04.8789737Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-09-07T09:12:04.8790353Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2025-09-07T09:12:04.8790963Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2025-09-07T09:12:04.8791455Z + BUILD_DIR=build 2025-09-07T09:12:04.8791714Z + BUILD_RENAMED_DIR=build_renamed 2025-09-07T09:12:04.8792020Z + BUILD_BIN_DIR=build/bin 2025-09-07T09:12:04.8792285Z + SHARD_NUMBER=6 2025-09-07T09:12:04.8792523Z + NUM_TEST_SHARDS=6 2025-09-07T09:12:04.8792784Z + export TORCH_SERIALIZATION_DEBUG=1 2025-09-07T09:12:04.8793109Z + TORCH_SERIALIZATION_DEBUG=1 2025-09-07T09:12:04.8793382Z + export VALGRIND=ON 2025-09-07T09:12:04.8793626Z + VALGRIND=ON 2025-09-07T09:12:04.8793926Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *clang9* ]] 2025-09-07T09:12:04.8794364Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *xpu* ]] 2025-09-07T09:12:04.8794713Z + detect_cuda_arch 2025-09-07T09:12:05.1127418Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *cuda* ]] 2025-09-07T09:12:05.1127944Z + command -v nvidia-smi 2025-09-07T09:12:05.1128223Z /usr/bin/nvidia-smi 2025-09-07T09:12:05.1128548Z ++ nvidia-smi --query-gpu=compute_cap --format=csv 2025-09-07T09:12:05.1128900Z ++ tail -n 1 2025-09-07T09:12:05.1342120Z + TORCH_CUDA_ARCH_LIST=8.0 2025-09-07T09:12:05.1342430Z 
+ export TORCH_CUDA_ARCH_LIST 2025-09-07T09:12:05.1342789Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *s390x* ]] 2025-09-07T09:12:05.1343141Z + [[ 0 == \1 ]] 2025-09-07T09:12:05.1343392Z + [[ True == \1 ]] 2025-09-07T09:12:05.1343698Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 != *bazel* ]] 2025-09-07T09:12:05.1347649Z ++ realpath build/custom_test_artifacts 2025-09-07T09:12:05.1357374Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/workspace/build/custom_test_artifacts 2025-09-07T09:12:05.1357921Z + [[ -n '' ]] 2025-09-07T09:12:05.1358171Z + echo 'Environment variables' 2025-09-07T09:12:05.1358471Z Environment variables 2025-09-07T09:12:05.1358723Z + env 2025-09-07T09:12:05.1365549Z GITHUB_WORKSPACE=/home/charlie/_work/pytorch/pytorch 2025-09-07T09:12:05.1366000Z CONTINUE_THROUGH_ERROR=True 2025-09-07T09:12:05.1366380Z BUILD_ENVIRONMENT=linux-jammy-cuda12.8-py3.10-gcc9-sm80 2025-09-07T09:12:05.1367045Z VLLM_TEST_HUGGING_FACE_TOKEN=*** 2025-09-07T09:12:05.1367342Z HOSTNAME=3a8cbe934959 2025-09-07T09:12:05.1367879Z GITHUB_PATH=/home/charlie/_work/_temp/_runner_file_commands/add_path_d23aa7da-8d94-4532-be5c-69937ca2d4a1 2025-09-07T09:12:05.1368469Z GITHUB_ACTION=__run_2 2025-09-07T09:12:05.3039495Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-09-07T09:12:05.3039814Z GITHUB_RUN_NUMBER=5418 2025-09-07T09:12:05.3040080Z TEST_CONFIG=inductor_torchbench_perf 2025-09-07T09:12:05.3040415Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-09-07T09:12:05.3040754Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2025-09-07T09:12:05.3041080Z SCCACHE_IDLE_TIMEOUT=0 2025-09-07T09:12:05.3041533Z SCRIBE_GRAPHQL_ACCESS_TOKEN=*** 2025-09-07T09:12:05.3041847Z GITHUB_TRIGGERING_ACTOR=desertfire 2025-09-07T09:12:05.3042173Z GITHUB_REF_TYPE=branch 2025-09-07T09:12:05.3042437Z TORCH_CUDA_ARCH_LIST=8.0 2025-09-07T09:12:05.3042735Z BASE_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T09:12:05.3043312Z XLA_CUDA= 2025-09-07T09:12:05.3043545Z NCCL_LIB_DIR=/usr/local/cuda/lib64/ 
2025-09-07T09:12:05.3043984Z HUGGING_FACE_HUB_TOKEN=*** 2025-09-07T09:12:05.3044389Z *** 2025-09-07T09:12:05.3044600Z GITHUB_REPOSITORY_ID=65600975 2025-09-07T09:12:05.3044890Z GITHUB_ACTIONS=true 2025-09-07T09:12:05.3045151Z NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T09:12:05.3045507Z SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-09-07T09:12:05.3045897Z SHA1=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T09:12:05.3046281Z GITHUB_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T09:12:05.3046929Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/inductor-perf-test-nightly.yml@refs/heads/main 2025-09-07T09:12:05.3047523Z UCC_HOME=/usr 2025-09-07T09:12:05.3047753Z TORCH_SERIALIZATION_DEBUG=1 2025-09-07T09:12:05.3048041Z VERBOSE_TEST_LOGS=False 2025-09-07T09:12:05.3048310Z GITHUB_REF=refs/heads/main 2025-09-07T09:12:05.3048579Z SHARD_NUMBER=6 2025-09-07T09:12:05.3048806Z GITHUB_REF_PROTECTED=true 2025-09-07T09:12:05.3049076Z HOME=/var/lib/jenkins 2025-09-07T09:12:05.3049333Z SCCACHE_SERVER_PORT=5229 2025-09-07T09:12:05.3049635Z GITHUB_API_URL=https://api.github.com 2025-09-07T09:12:05.3049964Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-09-07T09:12:05.3050319Z UCX_COMMIT=7836b165abdbe468a2f607e7254011c07d788152 2025-09-07T09:12:05.3050673Z USE_SYSTEM_NCCL=1 2025-09-07T09:12:05.3050909Z NUM_TEST_SHARDS=6 2025-09-07T09:12:05.3051132Z UCX_HOME=/usr 2025-09-07T09:12:05.3051642Z GITHUB_STATE=/home/charlie/_work/_temp/_runner_file_commands/save_state_d23aa7da-8d94-4532-be5c-69937ca2d4a1 2025-09-07T09:12:05.3052425Z JOB_NAME=cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T09:12:05.3053455Z GITHUB_ENV=/home/charlie/_work/_temp/_runner_file_commands/set_env_d23aa7da-8d94-4532-be5c-69937ca2d4a1 2025-09-07T09:12:05.3054146Z GITHUB_EVENT_PATH=/home/charlie/_work/_temp/_github_workflow/event.json 2025-09-07T09:12:05.3054600Z GITHUB_EVENT_NAME=schedule 2025-09-07T09:12:05.3055904Z 
DASHBOARD_TAG=training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true 2025-09-07T09:12:05.3057336Z GITHUB_RUN_ID=17525309334 2025-09-07T09:12:05.3057610Z INSTALLED_OPENBLAS= 2025-09-07T09:12:05.3058169Z GITHUB_STEP_SUMMARY=/home/charlie/_work/_temp/_runner_file_commands/step_summary_d23aa7da-8d94-4532-be5c-69937ca2d4a1 2025-09-07T09:12:05.3058804Z GITHUB_ACTOR=desertfire 2025-09-07T09:12:05.3059061Z PR_NUMBER= 2025-09-07T09:12:05.3059280Z DESIRED_CUDA=12.8.1 2025-09-07T09:12:05.3059511Z GITHUB_RUN_ATTEMPT=1 2025-09-07T09:12:05.3059755Z VALGRIND=ON 2025-09-07T09:12:05.3059985Z ANACONDA_PYTHON_VERSION=3.10 2025-09-07T09:12:05.3060333Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-09-07T09:12:05.3060675Z TERM=vt100 2025-09-07T09:12:05.3060927Z INSTALLED_VISION=yes 2025-09-07T09:12:05.3061172Z BRANCH=main 2025-09-07T09:12:05.3061403Z SCCACHE_REGION=us-east-1 2025-09-07T09:12:05.3061668Z OPENSSL_ROOT_DIR=/opt/openssl 2025-09-07T09:12:05.3061963Z CUDA_PATH=/usr/local/cuda 2025-09-07T09:12:05.3062426Z GITHUB_ACTION_PATH=/home/charlie/_work/pytorch/pytorch/./.github/actions/setup-linux 2025-09-07T09:12:05.3062960Z GITHUB_SERVER_URL=https://github.com 2025-09-07T09:12:05.3063518Z UCC_COMMIT=430e241bf5d38cbc73fc7a6b89155397232e3f96 2025-09-07T09:12:05.3063877Z REENABLED_ISSUES= 2025-09-07T09:12:05.3064112Z DOCS= 2025-09-07T09:12:05.3064314Z SHLVL=1 2025-09-07T09:12:05.3064504Z MAX_JOBS=10 2025-09-07T09:12:05.3064725Z GITHUB_ACTOR_ID=3659962 2025-09-07T09:12:05.3065071Z GITHUB_WORKFLOW_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T09:12:05.3065465Z GITHUB_REF_NAME=main 2025-09-07T09:12:05.3065848Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2025-09-07T09:12:05.3066288Z GITHUB_JOB=test 2025-09-07T09:12:05.3066522Z NO_TEST_TIMEOUT=False 2025-09-07T09:12:05.3066782Z 
TD_DISTRIBUTED=False 2025-09-07T09:12:05.3067047Z GITHUB_REPOSITORY=pytorch/pytorch 2025-09-07T09:12:05.3067359Z GITHUB_RETENTION_DAYS=90 2025-09-07T09:12:05.3067631Z OPENSSL_DIR=/opt/openssl 2025-09-07T09:12:05.3067910Z GITHUB_ACTION_REPOSITORY= 2025-09-07T09:12:05.3068707Z PATH=/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-09-07T09:12:05.3069555Z GITHUB_BASE_REF= 2025-09-07T09:12:05.3069790Z INSTALLED_ACL= 2025-09-07T09:12:05.3070212Z ARTIFACTS_FILE_SUFFIX=test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433 2025-09-07T09:12:05.3070684Z CI=true 2025-09-07T09:12:05.3070910Z GITHUB_REPOSITORY_OWNER=pytorch 2025-09-07T09:12:05.3071260Z RUST_LOG=sccache::server=error 2025-09-07T09:12:05.3071538Z JOB_ID=49775768433 2025-09-07T09:12:05.3071765Z GITHUB_HEAD_REF= 2025-09-07T09:12:05.3072000Z GITHUB_ACTION_REF= 2025-09-07T09:12:05.3072297Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2025-09-07T09:12:05.3072662Z TEST_SHOWLOCALS=False 2025-09-07T09:12:05.3072940Z GITHUB_WORKFLOW=inductor-A100-perf-nightly 2025-09-07T09:12:05.3073291Z DEBIAN_FRONTEND=noninteractive 2025-09-07T09:12:05.3073861Z GITHUB_OUTPUT=/home/charlie/_work/_temp/_runner_file_commands/set_output_d23aa7da-8d94-4532-be5c-69937ca2d4a1 2025-09-07T09:12:05.3074453Z NO_TD=False 2025-09-07T09:12:05.3074679Z SKIP_SCCACHE_INITIALIZATION=1 2025-09-07T09:12:05.3074993Z NCCL_INCLUDE_DIR=/usr/local/cuda/include/ 2025-09-07T09:12:05.3075314Z _=/usr/bin/env 2025-09-07T09:12:05.3075548Z + echo 'Testing pytorch' 2025-09-07T09:12:05.3075797Z Testing pytorch 2025-09-07T09:12:05.3076043Z + export LANG=C.UTF-8 2025-09-07T09:12:05.3076306Z + LANG=C.UTF-8 2025-09-07T09:12:05.3076517Z + PR_NUMBER= 2025-09-07T09:12:05.3076787Z + [[ inductor_torchbench_perf == \d\e\f\a\u\l\t ]] 2025-09-07T09:12:05.3077336Z + [[ inductor_torchbench_perf == \d\i\s\t\r\i\b\u\t\e\d ]] 2025-09-07T09:12:05.3077747Z + [[ 
inductor_torchbench_perf == \s\l\o\w ]] 2025-09-07T09:12:05.3078191Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *slow-gradcheck* ]] 2025-09-07T09:12:05.3078657Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *cuda* ]] 2025-09-07T09:12:05.3079064Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-09-07T09:12:05.3079425Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-09-07T09:12:05.3079768Z + [[ inductor_torchbench_perf == *crossref* ]] 2025-09-07T09:12:05.3080156Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *rocm* ]] 2025-09-07T09:12:05.3080582Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *xpu* ]] 2025-09-07T09:12:05.3081014Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 != *-bazel-* ]] 2025-09-07T09:12:05.3081402Z + pip_install ninja==1.10.2 2025-09-07T09:12:05.3081766Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-09-07T09:12:05.3082245Z + python3 -m pip install --progress-bar off ninja==1.10.2 2025-09-07T09:12:05.7438294Z Collecting ninja==1.10.2 2025-09-07T09:12:06.0599605Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-09-07T09:12:06.3136578Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2025-09-07T09:12:07.6820263Z Installing collected packages: ninja 2025-09-07T09:12:07.6820684Z Attempting uninstall: ninja 2025-09-07T09:12:07.6830019Z Found existing installation: ninja 1.11.1.3 2025-09-07T09:12:07.6850934Z Uninstalling ninja-1.11.1.3: 2025-09-07T09:12:07.9178123Z Successfully uninstalled ninja-1.11.1.3 2025-09-07T09:12:08.3482379Z Successfully installed ninja-1.10.2 2025-09-07T09:12:08.4501554Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-09-07T09:12:08.4503249Z + 
PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-09-07T09:12:08.4504301Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *aarch64* ]] 2025-09-07T09:12:08.4504764Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *asan* ]] 2025-09-07T09:12:08.4505186Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *-debug* ]] 2025-09-07T09:12:08.4505633Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 != *-bazel-* ]] 2025-09-07T09:12:08.4506251Z + echo 'We are not in debug mode: linux-jammy-cuda12.8-py3.10-gcc9-sm80. Expect the assertion to pass' 2025-09-07T09:12:08.4507021Z We are not in debug mode: linux-jammy-cuda12.8-py3.10-gcc9-sm80. Expect the assertion to pass 2025-09-07T09:12:08.4507525Z + cd test 2025-09-07T09:12:08.4507885Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-09-07T09:12:09.0416441Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:12:09.0417992Z import pynvml # type: ignore[import] 2025-09-07T09:12:10.2230715Z + [[ inductor_torchbench_perf == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-09-07T09:12:10.2231238Z + [[ inductor_torchbench_perf == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-09-07T09:12:10.2231744Z + [[ inductor_torchbench_perf == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-09-07T09:12:10.2234762Z + DYNAMO_BENCHMARK_FLAGS=() 2025-09-07T09:12:10.2235275Z + [[ inductor_torchbench_perf == *pr_time_benchmarks* ]] 2025-09-07T09:12:10.2235687Z + [[ inductor_torchbench_perf == *dynamo_eager* ]] 2025-09-07T09:12:10.2236074Z + [[ inductor_torchbench_perf == *aot_eager* ]] 2025-09-07T09:12:10.2236437Z + [[ inductor_torchbench_perf == *aot_inductor* ]] 2025-09-07T09:12:10.2236873Z + [[ inductor_torchbench_perf == *max_autotune_inductor* ]] 2025-09-07T09:12:10.2237678Z + [[ inductor_torchbench_perf == *inductor* ]] 2025-09-07T09:12:10.2238054Z + [[ inductor_torchbench_perf != *perf* ]] 2025-09-07T09:12:10.2238394Z + [[ inductor_torchbench_perf == *dynamic* ]] 2025-09-07T09:12:10.2238747Z + [[ inductor_torchbench_perf == *cpu* ]] 2025-09-07T09:12:10.2239115Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-09-07T09:12:10.2258685Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *libtorch* ]] 2025-09-07T09:12:10.2259161Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *-bazel-* ]] 2025-09-07T09:12:10.2262289Z + cd test 2025-09-07T09:12:10.2263388Z + python -c 'import torch; print(torch.__config__.show())' 2025-09-07T09:12:10.8306288Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:12:10.8307730Z import pynvml # type: ignore[import] 2025-09-07T09:12:12.3720856Z PyTorch built with: 2025-09-07T09:12:12.3721178Z - GCC 9.5 2025-09-07T09:12:12.3721405Z - C++ Version: 201703 2025-09-07T09:12:12.3721995Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-09-07T09:12:12.3722733Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-09-07T09:12:12.3723409Z - OpenMP 201511 (a.k.a. OpenMP 4.5) 2025-09-07T09:12:12.3723771Z - LAPACK is enabled (usually provided by MKL) 2025-09-07T09:12:12.3724472Z - NNPACK is enabled 2025-09-07T09:12:12.3724736Z - CPU capability usage: AVX512 2025-09-07T09:12:12.3725039Z - CUDA Runtime 12.8 2025-09-07T09:12:12.3725404Z - NVCC architecture flags: -gencode;arch=compute_80,code=sm_80 2025-09-07T09:12:12.3725809Z - CuDNN 90.8 2025-09-07T09:12:12.3730618Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=93fb23d6fae7c4e82c4239a1033e522088742634, CUDA_VERSION=12.8, CUDNN_VERSION=9.8.0, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Werror -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=ON, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, 
USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-09-07T09:12:12.3735617Z 2025-09-07T09:12:14.0647504Z + cd test 2025-09-07T09:12:14.0647912Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-09-07T09:12:14.6422956Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:12:14.6424376Z import pynvml # type: ignore[import] 2025-09-07T09:12:15.6268924Z ATen/Parallel: 2025-09-07T09:12:15.6269271Z at::get_num_threads() : 12 2025-09-07T09:12:15.6269591Z at::get_num_interop_threads() : 48 2025-09-07T09:12:15.6269901Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2025-09-07T09:12:15.6270205Z omp_get_max_threads() : 12 2025-09-07T09:12:15.6270793Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-09-07T09:12:15.6271759Z mkl_get_max_threads() : 12 2025-09-07T09:12:15.6272175Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-09-07T09:12:15.6272629Z std::thread::hardware_concurrency() : 96 2025-09-07T09:12:15.6272966Z Environment variables: 2025-09-07T09:12:15.6273238Z OMP_NUM_THREADS : [not set] 2025-09-07T09:12:15.6273523Z MKL_NUM_THREADS : [not set] 2025-09-07T09:12:15.6273807Z ATen parallel backend: OpenMP 2025-09-07T09:12:15.6274011Z 2025-09-07T09:12:15.9795788Z + [[ inductor_torchbench_perf == *numpy_2* ]] 2025-09-07T09:12:15.9796285Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *aarch64* ]] 2025-09-07T09:12:15.9796706Z + [[ inductor_torchbench_perf == *backward* ]] 2025-09-07T09:12:15.9797062Z + [[ inductor_torchbench_perf == *xla* ]] 2025-09-07T09:12:15.9797406Z + [[ inductor_torchbench_perf == *vllm* ]] 2025-09-07T09:12:15.9797752Z + [[ 
inductor_torchbench_perf == *executorch* ]] 2025-09-07T09:12:15.9798149Z + [[ inductor_torchbench_perf == \j\i\t\_\l\e\g\a\c\y ]] 2025-09-07T09:12:15.9798598Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *libtorch* ]] 2025-09-07T09:12:15.9799022Z + [[ inductor_torchbench_perf == distributed ]] 2025-09-07T09:12:15.9799414Z + [[ inductor_torchbench_perf == *operator_benchmark* ]] 2025-09-07T09:12:15.9799852Z + [[ inductor_torchbench_perf == *inductor_distributed* ]] 2025-09-07T09:12:15.9800290Z + [[ inductor_torchbench_perf == *inductor-halide* ]] 2025-09-07T09:12:15.9800722Z + [[ inductor_torchbench_perf == *inductor-triton-cpu* ]] 2025-09-07T09:12:15.9801178Z + [[ inductor_torchbench_perf == *inductor-micro-benchmark* ]] 2025-09-07T09:12:15.9801962Z + [[ inductor_torchbench_perf == *huggingface* ]] 2025-09-07T09:12:15.9802332Z + [[ inductor_torchbench_perf == *timm* ]] 2025-09-07T09:12:15.9802687Z + [[ inductor_torchbench_perf == cachebench ]] 2025-09-07T09:12:15.9803294Z + [[ inductor_torchbench_perf == verify_cachebench ]] 2025-09-07T09:12:15.9803693Z + [[ inductor_torchbench_perf == *torchbench* ]] 2025-09-07T09:12:15.9804035Z + install_torchaudio 2025-09-07T09:12:15.9804293Z + local commit 2025-09-07T09:12:15.9804551Z ++ get_pinned_commit audio 2025-09-07T09:12:15.9804829Z ++ cat .github/ci_commit_pins/audio.txt 2025-09-07T09:12:15.9813386Z + commit=2e300559e4e123928a22187b8f59a5b56f57ddc8 2025-09-07T09:12:15.9814120Z + pip_build_and_install git+https://github.com/pytorch/audio.git@2e300559e4e123928a22187b8f59a5b56f57ddc8 dist/audio 2025-09-07T09:12:15.9815017Z + local build_target=git+https://github.com/pytorch/audio.git@2e300559e4e123928a22187b8f59a5b56f57ddc8 2025-09-07T09:12:15.9815595Z + local wheel_dir=dist/audio 2025-09-07T09:12:15.9815877Z + local found_whl=0 2025-09-07T09:12:15.9816142Z + for file in "${wheel_dir}"/*.whl 2025-09-07T09:12:15.9816598Z + [[ -f dist/audio/torchaudio-2.8.0a0+2e30055-cp310-cp310-linux_x86_64.whl ]] 
2025-09-07T09:12:15.9817049Z + found_whl=1 2025-09-07T09:12:15.9817331Z + break 2025-09-07T09:12:15.9817538Z + '[' 1 == 0 ']' 2025-09-07T09:12:15.9817781Z + for file in "${wheel_dir}"/*.whl 2025-09-07T09:12:15.9818276Z + pip_install_whl dist/audio/torchaudio-2.8.0a0+2e30055-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:15.9818926Z + args=('dist/audio/torchaudio-2.8.0a0+2e30055-cp310-cp310-linux_x86_64.whl') 2025-09-07T09:12:15.9819380Z + local args 2025-09-07T09:12:15.9819771Z + [[ dist/audio/torchaudio-2.8.0a0+2e30055-cp310-cp310-linux_x86_64.whl == *\ * ]] 2025-09-07T09:12:15.9820258Z + for path in "${args[@]}" 2025-09-07T09:12:15.9820711Z + echo 'Installing dist/audio/torchaudio-2.8.0a0+2e30055-cp310-cp310-linux_x86_64.whl' 2025-09-07T09:12:15.9821391Z Installing dist/audio/torchaudio-2.8.0a0+2e30055-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:15.9822174Z + python3 -mpip install --no-index --no-deps dist/audio/torchaudio-2.8.0a0+2e30055-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:16.3708386Z Processing ./dist/audio/torchaudio-2.8.0a0+2e30055-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:16.4269336Z Installing collected packages: torchaudio 2025-09-07T09:12:17.0370291Z Successfully installed torchaudio-2.8.0a0+2e30055 2025-09-07T09:12:17.1328038Z + install_torchvision 2025-09-07T09:12:17.1328336Z + local orig_preload 2025-09-07T09:12:17.1328589Z + local commit 2025-09-07T09:12:17.1331349Z ++ get_pinned_commit vision 2025-09-07T09:12:17.1331651Z ++ cat .github/ci_commit_pins/vision.txt 2025-09-07T09:12:17.1343828Z + commit=966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-09-07T09:12:17.1344233Z + orig_preload= 2025-09-07T09:12:17.1344481Z + '[' -n '' ']' 2025-09-07T09:12:17.1344776Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *cuda* ]] 2025-09-07T09:12:17.1345182Z + export FORCE_CUDA=1 2025-09-07T09:12:17.1345439Z + FORCE_CUDA=1 2025-09-07T09:12:17.1345694Z + export WITH_CUDA=1 2025-09-07T09:12:17.1345933Z + WITH_CUDA=1 2025-09-07T09:12:17.1346528Z + 
pip_build_and_install git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 dist/vision 2025-09-07T09:12:17.1347459Z + local build_target=git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-09-07T09:12:17.1348064Z + local wheel_dir=dist/vision 2025-09-07T09:12:17.1348355Z + local found_whl=0 2025-09-07T09:12:17.1348600Z + for file in "${wheel_dir}"/*.whl 2025-09-07T09:12:17.1349076Z + [[ -f dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl ]] 2025-09-07T09:12:17.1349548Z + found_whl=1 2025-09-07T09:12:17.1349767Z + break 2025-09-07T09:12:17.1349958Z + '[' 1 == 0 ']' 2025-09-07T09:12:17.1350204Z + for file in "${wheel_dir}"/*.whl 2025-09-07T09:12:17.1350714Z + pip_install_whl dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:17.1351719Z + args=('dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl') 2025-09-07T09:12:17.1352185Z + local args 2025-09-07T09:12:17.1352600Z + [[ dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl == *\ * ]] 2025-09-07T09:12:17.1353112Z + for path in "${args[@]}" 2025-09-07T09:12:17.1353605Z + echo 'Installing dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl' 2025-09-07T09:12:17.1354316Z Installing dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:17.1355132Z + python3 -mpip install --no-index --no-deps dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:17.4901105Z Processing ./dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:17.5023470Z Installing collected packages: torchvision 2025-09-07T09:12:18.5509117Z Successfully installed torchvision-0.22.0a0+966da7e 2025-09-07T09:12:18.6324849Z + '[' -n '' ']' 2025-09-07T09:12:18.6325142Z + id=5 2025-09-07T09:12:18.6325391Z + pip_install opencv-python==4.8.0.74 2025-09-07T09:12:18.6325815Z + 
pip_install_pkg='python3 -m pip install --progress-bar off' 2025-09-07T09:12:18.6326323Z + python3 -m pip install --progress-bar off opencv-python==4.8.0.74 2025-09-07T09:12:19.2894931Z Collecting opencv-python==4.8.0.74 2025-09-07T09:12:19.3239843Z Downloading opencv_python-4.8.0.74-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (19 kB) 2025-09-07T09:12:19.5456600Z Requirement already satisfied: numpy>=1.21.2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from opencv-python==4.8.0.74) (1.22.4) 2025-09-07T09:12:19.5570832Z Downloading opencv_python-4.8.0.74-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (61.7 MB) 2025-09-07T09:12:23.2431145Z Installing collected packages: opencv-python 2025-09-07T09:12:23.2431575Z Attempting uninstall: opencv-python 2025-09-07T09:12:23.2444523Z Found existing installation: opencv-python 4.11.0.86 2025-09-07T09:12:23.2517073Z Uninstalling opencv-python-4.11.0.86: 2025-09-07T09:12:23.4084491Z Successfully uninstalled opencv-python-4.11.0.86 2025-09-07T09:12:24.8235910Z Successfully installed opencv-python-4.8.0.74 2025-09-07T09:12:24.9247621Z + [[ inductor_torchbench_perf == *inductor_torchbench_smoketest_perf* ]] 2025-09-07T09:12:24.9248207Z + [[ inductor_torchbench_perf == *inductor_torchbench_cpu_smoketest_perf* ]] 2025-09-07T09:12:24.9249186Z + [[ inductor_torchbench_perf == *torchbench_gcp_smoketest* ]] 2025-09-07T09:12:24.9249623Z + [[ inductor_torchbench_perf != *cpu* ]] 2025-09-07T09:12:24.9249966Z + install_torchrec_and_fbgemm 2025-09-07T09:12:24.9250244Z + local torchrec_commit 2025-09-07T09:12:24.9251864Z ++ get_pinned_commit torchrec 2025-09-07T09:12:24.9252189Z ++ cat .github/ci_commit_pins/torchrec.txt 2025-09-07T09:12:24.9264386Z + torchrec_commit=6cd9fd362514d14ebb9ed51314c62ac1e1e2bbf2 2025-09-07T09:12:24.9264800Z + local fbgemm_commit 2025-09-07T09:12:24.9268112Z ++ get_pinned_commit fbgemm 2025-09-07T09:12:24.9268873Z ++ cat .github/ci_commit_pins/fbgemm.txt 
2025-09-07T09:12:24.9279779Z + fbgemm_commit=de731af65b4f04696e85c729e3282450b51b95fd 2025-09-07T09:12:24.9280547Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *rocm* ]] 2025-09-07T09:12:24.9281208Z + pip_uninstall torchrec-nightly 2025-09-07T09:12:24.9281759Z + pip3 uninstall -y torchrec-nightly 2025-09-07T09:12:25.3806167Z WARNING: Skipping torchrec-nightly as it is not installed. 2025-09-07T09:12:25.4173852Z + pip_uninstall fbgemm-gpu-nightly 2025-09-07T09:12:25.4174222Z + pip3 uninstall -y fbgemm-gpu-nightly 2025-09-07T09:12:25.7767534Z WARNING: Skipping fbgemm-gpu-nightly as it is not installed. 2025-09-07T09:12:25.8156247Z + pip_install setuptools-git-versioning scikit-build pyre-extensions 2025-09-07T09:12:25.8156856Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-09-07T09:12:25.8157515Z + python3 -m pip install --progress-bar off setuptools-git-versioning scikit-build pyre-extensions 2025-09-07T09:12:26.1705096Z Requirement already satisfied: setuptools-git-versioning in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (2.1.0) 2025-09-07T09:12:26.1706698Z Requirement already satisfied: scikit-build in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (0.18.1) 2025-09-07T09:12:26.1709267Z Requirement already satisfied: pyre-extensions in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (0.0.32) 2025-09-07T09:12:26.1721757Z Requirement already satisfied: packaging in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from setuptools-git-versioning) (25.0) 2025-09-07T09:12:26.1726336Z Requirement already satisfied: setuptools in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from setuptools-git-versioning) (80.9.0) 2025-09-07T09:12:26.1733467Z Requirement already satisfied: tomli>=2.0.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from setuptools-git-versioning) (2.2.1) 2025-09-07T09:12:26.1778176Z Requirement already satisfied: distro in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from 
scikit-build) (1.9.0) 2025-09-07T09:12:26.1786964Z Requirement already satisfied: wheel>=0.32.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from scikit-build) (0.45.1) 2025-09-07T09:12:26.1794906Z Requirement already satisfied: typing-inspect in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from pyre-extensions) (0.9.0) 2025-09-07T09:12:26.1798339Z Requirement already satisfied: typing-extensions in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from pyre-extensions) (4.15.0) 2025-09-07T09:12:26.1884502Z Requirement already satisfied: mypy-extensions>=0.3.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from typing-inspect->pyre-extensions) (1.1.0) 2025-09-07T09:12:27.2010187Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm80 == *rocm* ]] 2025-09-07T09:12:27.2011001Z + pip_build_and_install git+https://github.com/pytorch/torchrec.git@6cd9fd362514d14ebb9ed51314c62ac1e1e2bbf2 dist/torchrec 2025-09-07T09:12:27.2011985Z + local build_target=git+https://github.com/pytorch/torchrec.git@6cd9fd362514d14ebb9ed51314c62ac1e1e2bbf2 2025-09-07T09:12:27.2012591Z + local wheel_dir=dist/torchrec 2025-09-07T09:12:27.2012865Z + local found_whl=0 2025-09-07T09:12:27.2013124Z + for file in "${wheel_dir}"/*.whl 2025-09-07T09:12:27.2013492Z + [[ -f dist/torchrec/torchrec-0.3.2-py3-none-any.whl ]] 2025-09-07T09:12:27.2013860Z + found_whl=1 2025-09-07T09:12:27.2014065Z + break 2025-09-07T09:12:27.2014616Z + '[' 1 == 0 ']' 2025-09-07T09:12:27.2014870Z + for file in "${wheel_dir}"/*.whl 2025-09-07T09:12:27.2015260Z + pip_install_whl dist/torchrec/torchrec-0.3.2-py3-none-any.whl 2025-09-07T09:12:27.2015755Z + args=('dist/torchrec/torchrec-0.3.2-py3-none-any.whl') 2025-09-07T09:12:27.2016123Z + local args 2025-09-07T09:12:27.2016438Z + [[ dist/torchrec/torchrec-0.3.2-py3-none-any.whl == *\ * ]] 2025-09-07T09:12:27.2016846Z + for path in "${args[@]}" 2025-09-07T09:12:27.2017282Z + echo 'Installing dist/torchrec/torchrec-0.3.2-py3-none-any.whl' 
2025-09-07T09:12:27.2017796Z Installing dist/torchrec/torchrec-0.3.2-py3-none-any.whl 2025-09-07T09:12:27.2018382Z + python3 -mpip install --no-index --no-deps dist/torchrec/torchrec-0.3.2-py3-none-any.whl 2025-09-07T09:12:27.5619840Z Processing ./dist/torchrec/torchrec-0.3.2-py3-none-any.whl 2025-09-07T09:12:27.5692546Z Installing collected packages: torchrec 2025-09-07T09:12:28.7702377Z Successfully installed torchrec-0.3.2 2025-09-07T09:12:28.8092444Z + pip_build_and_install git+https://github.com/pytorch/FBGEMM.git@de731af65b4f04696e85c729e3282450b51b95fd#subdirectory=fbgemm_gpu dist/fbgemm_gpu 2025-09-07T09:12:28.8093633Z + local build_target=git+https://github.com/pytorch/FBGEMM.git@de731af65b4f04696e85c729e3282450b51b95fd#subdirectory=fbgemm_gpu 2025-09-07T09:12:28.8094348Z + local wheel_dir=dist/fbgemm_gpu 2025-09-07T09:12:28.8094639Z + local found_whl=0 2025-09-07T09:12:28.8094898Z + for file in "${wheel_dir}"/*.whl 2025-09-07T09:12:28.8095357Z + [[ -f dist/fbgemm_gpu/fbgemm_gpu-0.4.1.post421-cp310-cp310-linux_x86_64.whl ]] 2025-09-07T09:12:28.8096196Z + found_whl=1 2025-09-07T09:12:28.8096407Z + break 2025-09-07T09:12:28.8096631Z + '[' 1 == 0 ']' 2025-09-07T09:12:28.8096879Z + for file in "${wheel_dir}"/*.whl 2025-09-07T09:12:28.8097467Z + pip_install_whl dist/fbgemm_gpu/fbgemm_gpu-0.4.1.post421-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:28.8098144Z + args=('dist/fbgemm_gpu/fbgemm_gpu-0.4.1.post421-cp310-cp310-linux_x86_64.whl') 2025-09-07T09:12:28.8098597Z + local args 2025-09-07T09:12:28.8099010Z + [[ dist/fbgemm_gpu/fbgemm_gpu-0.4.1.post421-cp310-cp310-linux_x86_64.whl == *\ * ]] 2025-09-07T09:12:28.8099506Z + for path in "${args[@]}" 2025-09-07T09:12:28.8099980Z + echo 'Installing dist/fbgemm_gpu/fbgemm_gpu-0.4.1.post421-cp310-cp310-linux_x86_64.whl' 2025-09-07T09:12:28.8100652Z Installing dist/fbgemm_gpu/fbgemm_gpu-0.4.1.post421-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:28.8101438Z + python3 -mpip install --no-index --no-deps 
dist/fbgemm_gpu/fbgemm_gpu-0.4.1.post421-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:29.4655290Z Processing ./dist/fbgemm_gpu/fbgemm_gpu-0.4.1.post421-cp310-cp310-linux_x86_64.whl 2025-09-07T09:12:29.9966773Z Installing collected packages: fbgemm-gpu 2025-09-07T09:12:33.1718622Z Successfully installed fbgemm-gpu-0.4.1.post421 2025-09-07T09:12:33.2366069Z + PYTHONPATH=/torchbench 2025-09-07T09:12:33.2366414Z + test_dynamo_benchmark torchbench 5 2025-09-07T09:12:33.2370506Z ++ pwd 2025-09-07T09:12:33.2372738Z + TEST_REPORTS_DIR=/var/lib/jenkins/workspace/test/test-reports 2025-09-07T09:12:33.2373198Z + local suite=torchbench 2025-09-07T09:12:33.2373456Z + shift 2025-09-07T09:12:33.2373687Z + local shard_id=5 2025-09-07T09:12:33.2373931Z + shift 2025-09-07T09:12:33.2374191Z + [[ inductor_torchbench_perf == *perf_compare* ]] 2025-09-07T09:12:33.2374561Z + [[ inductor_torchbench_perf == *perf* ]] 2025-09-07T09:12:33.2374962Z + [[ inductor_torchbench_perf == *b200* ]] 2025-09-07T09:12:33.2375405Z + test_single_dynamo_benchmark dashboard torchbench 5 2025-09-07T09:12:33.2377571Z ++ pwd 2025-09-07T09:12:33.2379610Z + TEST_REPORTS_DIR=/var/lib/jenkins/workspace/test/test-reports 2025-09-07T09:12:33.2380145Z + mkdir -p /var/lib/jenkins/workspace/test/test-reports 2025-09-07T09:12:33.2396287Z + local name=dashboard 2025-09-07T09:12:33.2396620Z + shift 2025-09-07T09:12:33.2396841Z + local suite=torchbench 2025-09-07T09:12:33.2397084Z + shift 2025-09-07T09:12:33.2397347Z + local shard_id=5 2025-09-07T09:12:33.2397584Z + shift 2025-09-07T09:12:33.2398233Z + partition_flags=() 2025-09-07T09:12:33.2398514Z + local partition_flags 2025-09-07T09:12:33.2398808Z + [[ -n 6 ]] 2025-09-07T09:12:33.2399059Z + [[ -n 5 ]] 2025-09-07T09:12:33.2399472Z + partition_flags=(--total-partitions "$NUM_TEST_SHARDS" --partition-id "$shard_id") 2025-09-07T09:12:33.2400078Z + [[ inductor_torchbench_perf == *perf_compare* ]] 2025-09-07T09:12:33.2400516Z + [[ inductor_torchbench_perf == *perf* ]] 
2025-09-07T09:12:33.2401045Z + test_perf_for_dashboard torchbench --device cuda --total-partitions 6 --partition-id 5 2025-09-07T09:12:33.2401631Z ++ pwd 2025-09-07T09:12:33.2403177Z + TEST_REPORTS_DIR=/var/lib/jenkins/workspace/test/test-reports 2025-09-07T09:12:33.2403710Z + mkdir -p /var/lib/jenkins/workspace/test/test-reports 2025-09-07T09:12:33.2415885Z + local suite=torchbench 2025-09-07T09:12:33.2416166Z + shift 2025-09-07T09:12:33.2416384Z + local backend=inductor 2025-09-07T09:12:33.2416640Z + modes=() 2025-09-07T09:12:33.2416855Z + local modes 2025-09-07T09:12:33.2418253Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *training-true* ]] 2025-09-07T09:12:33.2419632Z + modes+=(training) 2025-09-07T09:12:33.2420942Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *inference-true* ]] 2025-09-07T09:12:33.2422553Z + modes+=(inference) 2025-09-07T09:12:33.2422830Z + targets=('accuracy' 'performance') 2025-09-07T09:12:33.2423140Z + local targets 2025-09-07T09:12:33.2423371Z + local device=cuda 2025-09-07T09:12:33.2423622Z + [[ inductor_torchbench_perf == *cpu* ]] 2025-09-07T09:12:33.2423981Z + [[ inductor_torchbench_perf == *cuda_a10g* ]] 2025-09-07T09:12:33.2424344Z + [[ inductor_torchbench_perf == *h100* ]] 2025-09-07T09:12:33.2424692Z + [[ inductor_torchbench_perf == *b200* ]] 2025-09-07T09:12:33.2425021Z + [[ inductor_torchbench_perf == *rocm* ]] 2025-09-07T09:12:33.2425349Z + for mode in "${modes[@]}" 2025-09-07T09:12:33.2425635Z + [[ training == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T09:12:33.2425952Z + [[ training == \t\r\a\i\n\i\n\g ]] 2025-09-07T09:12:33.2426233Z + dtype=amp 2025-09-07T09:12:33.2426464Z + for target in 
"${targets[@]}" 2025-09-07T09:12:33.2426762Z + target_flag=('--accuracy') 2025-09-07T09:12:33.2427039Z + local target_flag 2025-09-07T09:12:33.2427284Z + [[ accuracy == \p\e\r\f\o\r\m\a\n\c\e ]] 2025-09-07T09:12:33.2427611Z + [[ accuracy == \a\c\c\u\r\a\c\y ]] 2025-09-07T09:12:33.2427949Z + target_flag+=(--no-translation-validation) 2025-09-07T09:12:33.2429334Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *freezing-true* ]] 2025-09-07T09:12:33.2431738Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *default-true* ]] 2025-09-07T09:12:33.2434269Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --training --amp --backend inductor --disable-cudagraphs --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_no_cudagraphs_torchbench_amp_training_cuda_accuracy.csv 2025-09-07T09:12:33.8533158Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:12:33.8534581Z import pynvml # type: ignore[import] 2025-09-07T09:12:37.9092687Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:12:37.9094134Z import pynvml # type: ignore[import] 2025-09-07T09:12:40.7280780Z 2025-09-07T09:12:41.8283900Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:12:41.8284279Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:12:41.8284613Z cuda train soft_actor_critic 2025-09-07T09:12:49.1600989Z W0907 09:12:49.159000 204 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:12:50.6280538Z pass 2025-09-07T09:12:53.4122111Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:12:53.4123738Z import pynvml # type: ignore[import] 2025-09-07T09:12:56.1599567Z 2025-09-07T09:12:57.8696750Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:12:57.8697248Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:12:57.8697598Z cuda train speech_transformer 2025-09-07T09:14:01.5841882Z pass 2025-09-07T09:14:05.2230634Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:14:05.2232501Z import pynvml # type: ignore[import] 2025-09-07T09:14:08.0933670Z 2025-09-07T09:14:08.2969156Z loading model: 0it [00:00, ?it/s]Downloading: "https://download.pytorch.org/models/squeezenet1_1-b8a52dc0.pth" to /var/lib/jenkins/.cache/torch/hub/checkpoints/squeezenet1_1-b8a52dc0.pth 2025-09-07T09:14:08.3204625Z 2025-09-07T09:14:08.3205380Z 2025-09-07T09:14:08.3542316Z 0% 0.00/4.73M [00:00 will be ignored 2025-09-07T09:20:44.3888821Z E0907 09:20:44.388000 10466 site-packages/torch/_dynamo/utils.py:3115] RMSE (res-fp64): 4.97162, (ref-fp64): 1.58286 and shape=torch.Size([1152]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.080000, use_larger_multiplier_for_smaller_tensor: 1 2025-09-07T09:20:44.3892224Z E0907 09:20:44.388000 10466 site-packages/torch/_dynamo/utils.py:2976] Accuracy failed for key name blocks.6.0.bn1.weight.grad 2025-09-07T09:20:44.4194986Z fail_accuracy 2025-09-07T09:20:49.8983995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:20:49.8986682Z import pynvml # type: ignore[import] 2025-09-07T09:20:52.6109846Z 2025-09-07T09:20:54.6903907Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:20:54.6904210Z 2025-09-07T09:20:54.7947311Z model.safetensors: 0% 0.00/286M [00:00 will be ignored 2025-09-07T09:23:16.9427658Z W0907 09:23:16.942000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9436291Z W0907 09:23:16.943000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9444260Z W0907 09:23:16.944000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T09:23:16.9451678Z W0907 09:23:16.944000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9459147Z W0907 09:23:16.945000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9466587Z W0907 09:23:16.946000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9473997Z W0907 09:23:16.947000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9481103Z W0907 09:23:16.947000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9489161Z W0907 09:23:16.948000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9496276Z W0907 09:23:16.949000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9503575Z W0907 09:23:16.950000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9510936Z W0907 09:23:16.950000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9527305Z W0907 09:23:16.952000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9534251Z W0907 09:23:16.953000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9542007Z W0907 09:23:16.953000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T09:23:16.9549072Z W0907 09:23:16.954000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9556178Z W0907 09:23:16.955000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9564113Z W0907 09:23:16.956000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9571373Z W0907 09:23:16.956000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9578544Z W0907 09:23:16.957000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9586158Z W0907 09:23:16.958000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9593376Z W0907 09:23:16.959000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9600582Z W0907 09:23:16.959000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9613233Z W0907 09:23:16.960000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9620002Z W0907 09:23:16.961000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9627735Z W0907 09:23:16.962000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9634835Z W0907 09:23:16.963000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T09:23:16.9641743Z W0907 09:23:16.963000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:16.9649674Z W0907 09:23:16.964000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0156084Z W0907 09:23:17.015000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0164532Z W0907 09:23:17.016000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0172811Z W0907 09:23:17.016000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0180821Z W0907 09:23:17.017000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0188435Z W0907 09:23:17.018000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0196523Z W0907 09:23:17.019000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0204558Z W0907 09:23:17.020000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0211999Z W0907 09:23:17.020000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0220174Z W0907 09:23:17.021000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0227704Z W0907 09:23:17.022000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T09:23:17.0235328Z W0907 09:23:17.023000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0242995Z W0907 09:23:17.023000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0259635Z W0907 09:23:17.025000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0267035Z W0907 09:23:17.026000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0274997Z W0907 09:23:17.027000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0282640Z W0907 09:23:17.027000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0290558Z W0907 09:23:17.028000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0298649Z W0907 09:23:17.029000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0306341Z W0907 09:23:17.030000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0313828Z W0907 09:23:17.031000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0322073Z W0907 09:23:17.031000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0329768Z W0907 09:23:17.032000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T09:23:17.0337463Z W0907 09:23:17.033000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0349829Z W0907 09:23:17.034000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0357221Z W0907 09:23:17.035000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0365720Z W0907 09:23:17.036000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0373165Z W0907 09:23:17.037000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0380912Z W0907 09:23:17.037000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.0388825Z W0907 09:23:17.038000 15535 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:23:17.1575388Z pass 2025-09-07T09:23:22.5794359Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:23:22.5807874Z import pynvml # type: ignore[import] 2025-09-07T09:23:25.2827990Z 2025-09-07T09:23:26.7486877Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:23:26.7487179Z 2025-09-07T09:23:26.8762633Z model.safetensors: 0% 0.00/42.6M [00:00 will be ignored 2025-09-07T09:27:24.1004651Z pass 2025-09-07T09:27:30.8425976Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. 
If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:27:30.8427402Z import pynvml # type: ignore[import] 2025-09-07T09:27:33.8181998Z 2025-09-07T09:27:34.8692198Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:27:34.8692578Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:27:34.8692911Z cuda train tts_angular 2025-09-07T09:27:39.9720480Z W0907 09:27:39.971000 22468 site-packages/torch/_logging/_internal.py:1199] [10/0] Profiler function will be ignored 2025-09-07T09:27:43.7511001Z pass 2025-09-07T09:27:46.8949192Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:27:46.8950625Z import pynvml # type: ignore[import] 2025-09-07T09:27:49.7698942Z 2025-09-07T09:27:50.8768558Z loading model: 0it [00:00, ?it/s]Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /var/lib/jenkins/.cache/torch/hub/checkpoints/vgg16-397923af.pth 2025-09-07T09:27:50.8932920Z 2025-09-07T09:27:50.8932957Z 2025-09-07T09:27:50.9939136Z 0% 0.00/528M [00:00 2025-09-07T09:30:15.3369739Z W0907 09:30:15.334000 23780 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T09:30:15.3371153Z W0907 09:30:15.334000 23780 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T09:30:15.3372528Z W0907 09:30:15.334000 23780 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T09:30:15.3373423Z W0907 09:30:15.334000 23780 
site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T09:30:15.3374100Z W0907 09:30:15.334000 23780 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T09:30:47.4280326Z W0907 09:30:47.426000 23780 site-packages/torch/fx/experimental/symbolic_shapes.py:2396] [30/2] RecursionError in sympy.xreplace(Eq(Mod(2*s29, s93), 0), {s29: evaluate_static_shape_0 + 1, s93: evaluate_static_shape_1 + 1}) 2025-09-07T09:31:18.3746897Z W0907 09:31:18.373000 23780 site-packages/torch/_logging/_internal.py:1199] [51/0] Profiler function will be ignored 2025-09-07T09:31:48.8120648Z W0907 09:31:48.811000 23780 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:31:48.8405382Z W0907 09:31:48.840000 23780 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:31:48.9367899Z pass 2025-09-07T09:31:53.9834331Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:31:53.9835755Z import pynvml # type: ignore[import] 2025-09-07T09:31:56.5974973Z 2025-09-07T09:31:59.8128253Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:31:59.8128637Z loading model: 0it [00:03, ?it/s] 2025-09-07T09:31:59.8128963Z cuda train yolov3 2025-09-07T09:33:16.0129232Z W0907 09:33:16.011000 28666 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:33:56.8773093Z pass 2025-09-07T09:34:02.1837305Z accuracy pass_rate=83.33% 2025-09-07T09:34:02.1844374Z calls_captured gmean=0.00x mean=784.000x 2025-09-07T09:34:02.1848869Z unique_graphs gmean=0.00x mean=5.889x 2025-09-07T09:34:02.1853255Z graph_breaks gmean=0.00x mean=8.722x 2025-09-07T09:34:02.1858493Z unique_graph_breaks gmean=0.00x mean=4.889x 2025-09-07T09:34:02.1863100Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T09:34:02.1867283Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T09:34:02.1871547Z cudagraph_skips gmean=0.00x mean=0.000x 2025-09-07T09:34:02.1872829Z compilation_latency mean=52.782 seconds 2025-09-07T09:34:03.1458474Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *cudagraphs-true* ]] 2025-09-07T09:34:03.1461110Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --training --amp --backend inductor --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_with_cudagraphs_torchbench_amp_training_cuda_accuracy.csv 2025-09-07T09:34:03.7425503Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:34:03.7426993Z import pynvml # type: ignore[import] 2025-09-07T09:34:07.5396615Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:34:07.5398053Z import pynvml # type: ignore[import] 2025-09-07T09:34:10.3478065Z 2025-09-07T09:34:11.4783607Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:34:11.4783988Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:34:11.4784323Z cuda train soft_actor_critic 2025-09-07T09:34:17.1175121Z W0907 09:34:17.116000 31520 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:34:18.3437698Z pass 2025-09-07T09:34:21.1705808Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:34:21.1707217Z import pynvml # type: ignore[import] 2025-09-07T09:34:23.9587979Z 2025-09-07T09:34:25.6706350Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:34:25.6706727Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:34:25.6707071Z cuda train speech_transformer 2025-09-07T09:34:38.8975808Z W0907 09:34:38.896000 31658 site-packages/torch/_inductor/utils.py:2298] [9/0_1] DeviceCopy in input program 2025-09-07T09:34:40.5531347Z cudagraph partition due to non gpu ops 2025-09-07T09:34:40.5532098Z cudagraph partition due to non gpu ops 2025-09-07T09:34:40.5532729Z cudagraph partition due to non gpu ops 2025-09-07T09:34:40.5533307Z cudagraph partition due to non gpu ops 2025-09-07T09:34:40.5533919Z cudagraph partition due to non gpu ops 2025-09-07T09:34:40.5534521Z cudagraph partition due to non gpu ops 2025-09-07T09:34:40.5535121Z cudagraph partition due to non gpu ops 2025-09-07T09:34:40.5535734Z cudagraph partition due to DeviceCopy ops 2025-09-07T09:34:40.6002609Z cudagraph partition into 2 partitions 2025-09-07T09:35:04.1758187Z W0907 09:35:04.175000 31658 site-packages/torch/_inductor/utils.py:2298] [15/0_1] DeviceCopy in input program 2025-09-07T09:35:06.3491610Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3492013Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3492362Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3492798Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3493160Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3493512Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3493855Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3494198Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3494538Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3494865Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3495202Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3495542Z cudagraph partition due to non gpu ops 
2025-09-07T09:35:06.3495881Z cudagraph partition due to non gpu ops 2025-09-07T09:35:06.3496227Z cudagraph partition due to DeviceCopy ops 2025-09-07T09:35:06.4267822Z cudagraph partition into 2 partitions 2025-09-07T09:35:10.9559463Z W0907 09:35:10.955000 31658 site-packages/torch/_inductor/utils.py:2298] [15/0_1] DeviceCopy in input program 2025-09-07T09:35:14.6308559Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6308994Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6309343Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6310181Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6310555Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6310884Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6311220Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6311554Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6311895Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6312220Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6312560Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6312901Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6313254Z cudagraph partition due to non gpu ops 2025-09-07T09:35:14.6313591Z cudagraph partition due to DeviceCopy ops 2025-09-07T09:35:15.2225079Z cudagraph partition into 2 partitions 2025-09-07T09:35:18.6147406Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/torchbench/torchbenchmark/models/speech_transformer/speech_transformer/transformer/decoder.py", line 126, in torch_dynamo_resume_in_forward_at_120 2025-09-07T09:35:18.6148893Z self.tgt_word_emb(ys_in_pad) * self.x_logit_scale 2025-09-07T09:35:18.6149158Z 2025-09-07T09:35:18.6149162Z 2025-09-07T09:35:19.0684929Z Run failed with return code: -11 2025-09-07T09:35:19.0685321Z Output: None 2025-09-07T09:35:19.0685550Z Error: None 2025-09-07T09:35:19.6797214Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:35:19.6800472Z import pynvml # type: ignore[import] 2025-09-07T09:35:22.4296102Z 2025-09-07T09:35:23.8865587Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:35:23.8865949Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:35:23.8866284Z cuda train squeezenet1_1 2025-09-07T09:35:43.6048699Z pass 2025-09-07T09:35:47.1689807Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:35:47.1692303Z import pynvml # type: ignore[import] 2025-09-07T09:35:49.8558347Z 2025-09-07T09:35:51.6953928Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:35:51.6954374Z 2025-09-07T09:35:51.9616934Z Loading pipeline components...: 0% 0/6 [00:00 will be ignored 2025-09-07T09:41:43.9069828Z E0907 09:41:43.906000 33336 site-packages/torch/_dynamo/utils.py:3115] RMSE (res-fp64): 4.87951, (ref-fp64): 1.58286 and shape=torch.Size([1152]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.080000, use_larger_multiplier_for_smaller_tensor: 1 2025-09-07T09:41:43.9073030Z E0907 09:41:43.906000 33336 site-packages/torch/_dynamo/utils.py:2976] Accuracy failed for key name blocks.6.0.bn1.weight.grad 2025-09-07T09:41:43.9367711Z fail_accuracy 2025-09-07T09:41:49.2925839Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:41:49.2928653Z import pynvml # type: ignore[import] 2025-09-07T09:41:52.0615375Z 2025-09-07T09:41:55.8164934Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:41:55.8165331Z loading model: 0it [00:03, ?it/s] 2025-09-07T09:41:55.8165695Z cuda train timm_nfnet 2025-09-07T09:42:40.4834392Z pass 2025-09-07T09:42:44.5614163Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:42:44.5615565Z import pynvml # type: ignore[import] 2025-09-07T09:42:47.5804051Z 2025-09-07T09:42:51.3132195Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:42:51.3133034Z loading model: 0it [00:03, ?it/s] 2025-09-07T09:42:51.3133364Z cuda train timm_regnet 2025-09-07T09:43:29.9671135Z W0907 09:43:29.966000 33616 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:44:12.3816611Z W0907 09:44:12.380000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3824458Z W0907 09:44:12.382000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3836851Z W0907 09:44:12.383000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3844064Z W0907 09:44:12.384000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3851306Z W0907 09:44:12.384000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3858730Z W0907 09:44:12.385000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. 
Consider running in higher precision. 2025-09-07T09:44:12.3865809Z W0907 09:44:12.386000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3872889Z W0907 09:44:12.387000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3879993Z W0907 09:44:12.387000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3896086Z W0907 09:44:12.389000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3903232Z W0907 09:44:12.390000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3910684Z W0907 09:44:12.390000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3917748Z W0907 09:44:12.391000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3925423Z W0907 09:44:12.392000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3932436Z W0907 09:44:12.392000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3939776Z W0907 09:44:12.393000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3946700Z W0907 09:44:12.394000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3954106Z W0907 09:44:12.395000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T09:44:12.3961149Z W0907 09:44:12.395000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3968835Z W0907 09:44:12.396000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3980293Z W0907 09:44:12.397000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3987173Z W0907 09:44:12.398000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.3994687Z W0907 09:44:12.399000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4001754Z W0907 09:44:12.399000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4009084Z W0907 09:44:12.400000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4016428Z W0907 09:44:12.401000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4036674Z W0907 09:44:12.403000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4043905Z W0907 09:44:12.404000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4534729Z W0907 09:44:12.453000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4542458Z W0907 09:44:12.453000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T09:44:12.4549914Z W0907 09:44:12.454000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4557438Z W0907 09:44:12.455000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4565084Z W0907 09:44:12.456000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4572946Z W0907 09:44:12.457000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4580392Z W0907 09:44:12.457000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4587702Z W0907 09:44:12.458000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4595155Z W0907 09:44:12.459000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4611479Z W0907 09:44:12.460000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4619137Z W0907 09:44:12.461000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4626631Z W0907 09:44:12.462000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4634095Z W0907 09:44:12.463000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4641395Z W0907 09:44:12.463000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T09:44:12.4649709Z W0907 09:44:12.464000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4657199Z W0907 09:44:12.465000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4664695Z W0907 09:44:12.466000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4672502Z W0907 09:44:12.466000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4679932Z W0907 09:44:12.467000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4687824Z W0907 09:44:12.468000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4699731Z W0907 09:44:12.469000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4707050Z W0907 09:44:12.470000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4715052Z W0907 09:44:12.471000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4722521Z W0907 09:44:12.471000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4730294Z W0907 09:44:12.472000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4738085Z W0907 09:44:12.473000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T09:44:12.4758303Z W0907 09:44:12.475000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.4766170Z W0907 09:44:12.476000 33616 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:44:12.5908291Z pass 2025-09-07T09:44:18.4209678Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:44:18.4211105Z import pynvml # type: ignore[import] 2025-09-07T09:44:21.2763661Z 2025-09-07T09:44:24.5658709Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:44:24.5659107Z loading model: 0it [00:03, ?it/s] 2025-09-07T09:44:24.5659425Z cuda train timm_resnest 2025-09-07T09:44:57.1781545Z pass 2025-09-07T09:45:00.7048021Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:45:00.7049495Z import pynvml # type: ignore[import] 2025-09-07T09:45:03.4373652Z 2025-09-07T09:45:05.8768106Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:45:05.8768694Z loading model: 0it [00:02, ?it/s] 2025-09-07T09:45:05.8769259Z cuda train timm_vision_transformer 2025-09-07T09:45:29.9963187Z pass 2025-09-07T09:45:33.6139869Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:45:33.6142487Z import pynvml # type: ignore[import] 2025-09-07T09:45:36.5758021Z 2025-09-07T09:45:50.8480049Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:45:50.8480443Z loading model: 0it [00:14, ?it/s] 2025-09-07T09:45:50.8480816Z cuda train timm_vision_transformer_large 2025-09-07T09:45:50.8714739Z pass_due_to_skip 2025-09-07T09:45:52.6873884Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:45:52.6875338Z import pynvml # type: ignore[import] 2025-09-07T09:45:55.4000739Z 2025-09-07T09:45:58.2351103Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:45:58.2351498Z loading model: 0it [00:02, ?it/s] 2025-09-07T09:45:58.2351829Z cuda train timm_vovnet 2025-09-07T09:46:29.9000128Z pass 2025-09-07T09:46:33.7455955Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:46:33.7457958Z import pynvml # type: ignore[import] 2025-09-07T09:46:36.5312595Z 2025-09-07T09:46:38.8660767Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:46:38.8661171Z loading model: 0it [00:02, ?it/s] 2025-09-07T09:46:38.8661507Z cuda train torch_multimodal_clip 2025-09-07T09:47:27.7569829Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T09:47:27.7571112Z pred = mod(*cloned_inputs) 2025-09-07T09:47:27.7571764Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchmultimodal/models/clip/model.py", line 72, in forward 2025-09-07T09:47:27.7572468Z embeddings_b = self.encoder_b(features_b) 2025-09-07T09:47:27.7573182Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchmultimodal/models/clip/text_encoder.py", line 132, in forward 2025-09-07T09:47:27.7574025Z hidden_state[torch.arange(hidden_state.shape[0]), text.argmax(dim=-1)] 2025-09-07T09:47:27.7574386Z 2025-09-07T09:47:27.7574391Z 2025-09-07T09:47:28.0011283Z W0907 09:47:28.000000 34220 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:48:15.3016072Z pass 2025-09-07T09:48:21.6205340Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:48:21.6206769Z import pynvml # type: ignore[import] 2025-09-07T09:48:24.4404758Z 2025-09-07T09:48:25.4799408Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:48:25.4799798Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:48:25.4800163Z cuda train tts_angular 2025-09-07T09:48:29.9960414Z W0907 09:48:29.995000 34358 site-packages/torch/_logging/_internal.py:1199] [10/0] Profiler function will be ignored 2025-09-07T09:48:30.5794221Z pass 2025-09-07T09:48:33.4349995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:48:33.4351752Z import pynvml # type: ignore[import] 2025-09-07T09:48:36.3538957Z 2025-09-07T09:48:40.0821466Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:48:40.0821857Z loading model: 0it [00:03, ?it/s] 2025-09-07T09:48:40.0823864Z cuda train vgg16 2025-09-07T09:48:55.0350423Z pass 2025-09-07T09:48:58.5959489Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:48:58.5960948Z import pynvml # type: ignore[import] 2025-09-07T09:49:01.5727572Z 2025-09-07T09:49:17.1769629Z loading model: 0it [00:00, ?it/s]skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:49:17.1771072Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:49:17.1771977Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:49:17.1772678Z v4 = masked_index(y_high, x_high) 2025-09-07T09:49:17.1773698Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:49:17.1774334Z return input[ 2025-09-07T09:49:17.1774495Z 2025-09-07T09:49:17.1774500Z 2025-09-07T09:49:30.5992309Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:49:30.5994722Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:49:30.5996223Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:49:30.5997506Z v4 = masked_index(y_high, x_high) 2025-09-07T09:49:30.5998595Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:49:30.5999701Z return input[ 2025-09-07T09:49:30.5999981Z 2025-09-07T09:49:30.5999989Z 2025-09-07T09:49:39.3561271Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:49:39.3562600Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:49:39.3563743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:49:39.3564451Z v4 = masked_index(y_high, x_high) 2025-09-07T09:49:39.3565071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:49:39.3565710Z return input[ 2025-09-07T09:49:39.3565870Z 2025-09-07T09:49:39.3565875Z 2025-09-07T09:49:47.9275918Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:49:47.9277265Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:49:47.9278165Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:49:47.9278872Z v4 = masked_index(y_high, x_high) 2025-09-07T09:49:47.9279886Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:49:47.9280536Z return input[ 2025-09-07T09:49:47.9280694Z 2025-09-07T09:49:47.9280699Z 2025-09-07T09:49:58.9203521Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:49:58.9204825Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:49:58.9205752Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:49:58.9206450Z v4 = masked_index(y_high, x_high) 2025-09-07T09:49:58.9207090Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:49:58.9207731Z return input[ 2025-09-07T09:49:58.9207875Z 2025-09-07T09:49:58.9207880Z 2025-09-07T09:50:09.4978710Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:50:09.4980034Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:50:09.4980936Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:50:09.4981993Z v4 = masked_index(y_high, x_high) 2025-09-07T09:50:09.4982637Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:50:09.4983271Z return input[ 2025-09-07T09:50:09.4983416Z 2025-09-07T09:50:09.4983421Z 2025-09-07T09:50:22.6643894Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:50:22.6645227Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:50:22.6646195Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:50:22.6646902Z v4 = masked_index(y_high, x_high) 2025-09-07T09:50:22.6647543Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 
2025-09-07T09:50:22.6648180Z return input[ 2025-09-07T09:50:22.6648339Z 2025-09-07T09:50:22.6648344Z 2025-09-07T09:50:27.9185446Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:50:27.9186778Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:50:27.9187675Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:50:27.9188365Z v4 = masked_index(y_high, x_high) 2025-09-07T09:50:27.9189009Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:50:27.9189646Z return input[ 2025-09-07T09:50:27.9189804Z 2025-09-07T09:50:27.9189809Z 2025-09-07T09:50:29.3500644Z 2025-09-07T09:50:29.3501206Z loading model: 0it [01:27, ?it/s] 2025-09-07T09:50:29.3501781Z cuda train vision_maskrcnn 2025-09-07T09:50:29.5127749Z W0907 09:50:29.512000 34584 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] torch._dynamo hit config.recompile_limit (8) 2025-09-07T09:50:29.5130254Z W0907 09:50:29.512000 34584 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] function: '_roi_align' (/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py:114) 2025-09-07T09:50:29.5132349Z W0907 09:50:29.512000 34584 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] last reason: 0/7: tensor 'rois' dtype mismatch. expected Float, actual Double 2025-09-07T09:50:29.5134065Z W0907 09:50:29.512000 34584 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". 
2025-09-07T09:50:29.5136050Z W0907 09:50:29.512000 34584 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html 2025-09-07T09:50:49.9885505Z W0907 09:50:49.987000 34584 site-packages/torch/_inductor/utils.py:2298] [4/0_1] DeviceCopy in input program 2025-09-07T09:50:49.9898488Z W0907 09:50:49.989000 34584 site-packages/torch/_inductor/utils.py:2298] [4/0_1] DeviceCopy in input program 2025-09-07T09:50:49.9909141Z W0907 09:50:49.990000 34584 site-packages/torch/_inductor/utils.py:2298] [4/0_1] DeviceCopy in input program 2025-09-07T09:50:49.9919554Z W0907 09:50:49.991000 34584 site-packages/torch/_inductor/utils.py:2298] [4/0_1] DeviceCopy in input program 2025-09-07T09:50:49.9930697Z W0907 09:50:49.992000 34584 site-packages/torch/_inductor/utils.py:2298] [4/0_1] DeviceCopy in input program 2025-09-07T09:50:50.6616383Z cudagraph partition due to DeviceCopy ops. Found from : 2025-09-07T09:50:50.6617180Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T09:50:50.6618400Z anchors = self.anchor_generator(images, features) 2025-09-07T09:50:50.6619155Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T09:50:50.6619925Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T09:50:50.6620733Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T09:50:50.6621669Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T09:50:50.6622568Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T09:50:50.6623469Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in 
self.cell_anchors] 2025-09-07T09:50:50.6623890Z 2025-09-07T09:50:50.6624076Z cudagraph partition due to DeviceCopy ops. Found from : 2025-09-07T09:50:50.6624812Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T09:50:50.6625525Z anchors = self.anchor_generator(images, features) 2025-09-07T09:50:50.6626267Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T09:50:50.6627034Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T09:50:50.6627828Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T09:50:50.6628760Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T09:50:50.6629669Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T09:50:50.6630560Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T09:50:50.6630983Z 2025-09-07T09:50:50.6631164Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T09:50:50.6631899Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T09:50:50.6632611Z anchors = self.anchor_generator(images, features) 2025-09-07T09:50:50.6633569Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T09:50:50.6634329Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T09:50:50.6635126Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T09:50:50.6636055Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T09:50:50.6636967Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T09:50:50.6637875Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T09:50:50.6638283Z 2025-09-07T09:50:50.6638462Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T09:50:50.6639195Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T09:50:50.6639916Z anchors = self.anchor_generator(images, features) 2025-09-07T09:50:50.6640667Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T09:50:50.6641424Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T09:50:50.6642198Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T09:50:50.6643398Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T09:50:50.6644491Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T09:50:50.6645392Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T09:50:50.6645793Z 2025-09-07T09:50:50.6646009Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T09:50:50.6646739Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T09:50:50.6647450Z anchors = self.anchor_generator(images, features) 2025-09-07T09:50:50.6648198Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T09:50:50.6648959Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T09:50:50.6649755Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T09:50:50.6650674Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T09:50:50.6651578Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T09:50:50.6652483Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T09:50:50.6652886Z 2025-09-07T09:50:50.6844756Z cudagraph partition into 2 partitions 2025-09-07T09:50:52.6614593Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break from `Tensor.item()`, consider setting: 2025-09-07T09:50:52.6615737Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] torch._dynamo.config.capture_scalar_outputs = True 2025-09-07T09:50:52.6616638Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] or: 2025-09-07T09:50:52.6617640Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] env TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 2025-09-07T09:50:52.6618672Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] to include these operations in the captured graph. 
2025-09-07T09:50:52.6619542Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T09:50:52.6620808Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break: from user code at: 2025-09-07T09:50:52.6622343Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in torch_dynamo_resume_in_forward_at_117 2025-09-07T09:50:52.6624207Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T09:50:52.6625806Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 763, in forward 2025-09-07T09:50:52.6627302Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T09:50:52.6628693Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 310, in forward 2025-09-07T09:50:52.6629980Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] self.scales, self.map_levels = _setup_scales( 2025-09-07T09:50:52.6631311Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in _setup_scales 2025-09-07T09:50:52.6632921Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat in 
features] 2025-09-07T09:50:52.6634335Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in 2025-09-07T09:50:52.6635740Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T09:50:52.6637150Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T09:50:52.6638521Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T09:50:52.6639405Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T09:50:52.6640076Z W0907 09:50:52.660000 34584 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T09:50:52.9794099Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T09:50:52.9794984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T09:50:52.9795885Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T09:50:52.9796247Z 2025-09-07T09:50:53.2456102Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T09:50:53.2456980Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T09:50:53.2457967Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T09:50:53.2458324Z 2025-09-07T09:51:00.7729971Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T09:51:00.7731861Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T09:51:00.7732800Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T09:51:00.7733503Z return fn(*args[2:], **kwargs) 2025-09-07T09:51:00.7734122Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:51:00.7734964Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:51:00.7735868Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:51:00.7736551Z v4 = masked_index(y_high, x_high) 2025-09-07T09:51:00.7737185Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:51:00.7737883Z return input[ 2025-09-07T09:51:00.7738037Z 2025-09-07T09:51:00.7738043Z 2025-09-07T09:51:05.6589263Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T09:51:05.6590826Z return 
_roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T09:51:05.6591764Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T09:51:05.6592863Z return fn(*args[2:], **kwargs) 2025-09-07T09:51:05.6593490Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:51:05.6594336Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:51:05.6595234Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:51:05.6595933Z v4 = masked_index(y_high, x_high) 2025-09-07T09:51:05.6596567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:51:05.6597201Z return input[ 2025-09-07T09:51:05.6597348Z 2025-09-07T09:51:05.6597352Z 2025-09-07T09:51:10.3533170Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T09:51:10.3534709Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T09:51:10.3535632Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T09:51:10.3536345Z return fn(*args[2:], **kwargs) 2025-09-07T09:51:10.3536966Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:51:10.3537893Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:51:10.3538797Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:51:10.3539489Z v4 = masked_index(y_high, x_high) 2025-09-07T09:51:10.3540129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:51:10.3540753Z return input[ 2025-09-07T09:51:10.3540896Z 2025-09-07T09:51:10.3540901Z 2025-09-07T09:51:25.7226425Z W0907 09:51:25.721000 34584 site-packages/torch/fx/experimental/symbolic_shapes.py:2396] [30/4] RecursionError in sympy.xreplace(Eq(Mod(2*s29, s93), 0), {s29: evaluate_static_shape_0 + 1, s93: evaluate_static_shape_1 + 1}) 2025-09-07T09:51:26.8765686Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T09:51:26.8767192Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T09:51:26.8768137Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T09:51:26.8768856Z return fn(*args[2:], **kwargs) 2025-09-07T09:51:26.8769487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:51:26.8770324Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:51:26.8771229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:51:26.8771930Z v4 = masked_index(y_high, x_high) 2025-09-07T09:51:26.8772549Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:51:26.8773180Z return input[ 
2025-09-07T09:51:26.8773325Z 2025-09-07T09:51:26.8773343Z 2025-09-07T09:51:33.5759232Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T09:51:33.5761099Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T09:51:33.5762054Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T09:51:33.5762742Z return fn(*args[2:], **kwargs) 2025-09-07T09:51:33.5763610Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:51:33.5764456Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:51:33.5765342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:51:33.5766027Z v4 = masked_index(y_high, x_high) 2025-09-07T09:51:33.5766657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:51:33.5767286Z return input[ 2025-09-07T09:51:33.5767427Z 2025-09-07T09:51:33.5767443Z 2025-09-07T09:51:43.9194516Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T09:51:43.9196019Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T09:51:43.9196959Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 
2025-09-07T09:51:43.9197646Z return fn(*args[2:], **kwargs) 2025-09-07T09:51:43.9198272Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:51:43.9199112Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:51:43.9200020Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:51:43.9200719Z v4 = masked_index(y_high, x_high) 2025-09-07T09:51:43.9201339Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:51:43.9202307Z return input[ 2025-09-07T09:51:43.9202475Z 2025-09-07T09:51:43.9202480Z 2025-09-07T09:51:47.3381863Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T09:51:47.3383343Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T09:51:47.3384283Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T09:51:47.3384995Z return fn(*args[2:], **kwargs) 2025-09-07T09:51:47.3385618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:51:47.3386459Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:51:47.3387360Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:51:47.3388058Z v4 = masked_index(y_high, x_high) 2025-09-07T09:51:47.3388678Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:51:47.3389310Z return input[ 2025-09-07T09:51:47.3389466Z 2025-09-07T09:51:47.3389471Z 2025-09-07T09:51:53.0377288Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 418, in torch_dynamo_resume_in_paste_mask_in_image_at_410 2025-09-07T09:51:53.0379252Z mask = F.interpolate(mask, size=(h, w), mode="bilinear", align_corners=False) 2025-09-07T09:51:53.0379627Z 2025-09-07T09:51:53.0379632Z 2025-09-07T09:51:54.5970948Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 418, in torch_dynamo_resume_in_paste_mask_in_image_at_410 2025-09-07T09:51:54.5972433Z mask = F.interpolate(mask, size=(h, w), mode="bilinear", align_corners=False) 2025-09-07T09:51:54.5972811Z 2025-09-07T09:51:54.5972816Z 2025-09-07T09:51:55.5762744Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 823, in torch_dynamo_resume_in_forward_at_806 2025-09-07T09:51:55.5764479Z masks_probs = maskrcnn_inference(mask_logits, labels) 2025-09-07T09:51:55.5765286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 79, in maskrcnn_inference 2025-09-07T09:51:55.5766078Z mask_prob = mask_prob[index, labels][:, None] 2025-09-07T09:51:55.5766345Z 2025-09-07T09:51:55.5766365Z 2025-09-07T09:51:57.3689698Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T09:51:57.3691190Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T09:51:57.3692127Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T09:51:57.3692851Z return fn(*args[2:], **kwargs) 2025-09-07T09:51:57.3693474Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:51:57.3694322Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:51:57.3695517Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:51:57.3696229Z v4 = masked_index(y_high, x_high) 2025-09-07T09:51:57.3696855Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:51:57.3697573Z return input[ 2025-09-07T09:51:57.3697732Z 2025-09-07T09:51:57.3697736Z 2025-09-07T09:52:01.2907500Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 114, in forward 2025-09-07T09:52:01.2908802Z features = self.backbone(images.tensors) 2025-09-07T09:52:01.2909544Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/backbone_utils.py", line 58, in forward 2025-09-07T09:52:01.2910259Z x = self.fpn(x) 2025-09-07T09:52:01.2910884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/feature_pyramid_network.py", line 194, in forward 2025-09-07T09:52:01.2911726Z inner_top_down = F.interpolate(last_inner, size=feat_shape, 
mode="nearest") 2025-09-07T09:52:01.2912096Z 2025-09-07T09:52:01.2912101Z 2025-09-07T09:52:01.4331269Z W0907 09:52:01.432000 34584 site-packages/torch/_logging/_internal.py:1199] [51/0] Profiler function will be ignored 2025-09-07T09:52:13.5616420Z W0907 09:52:13.561000 34584 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] torch._dynamo hit config.recompile_limit (8) 2025-09-07T09:52:13.5618378Z W0907 09:52:13.561000 34584 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] function: 'torch_dynamo_resume_in_roi_align_at_255' (/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py:255) 2025-09-07T09:52:13.5619894Z W0907 09:52:13.561000 34584 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] last reason: 30/7: tensor 'rois' requires_grad mismatch. expected requires_grad=1 2025-09-07T09:52:13.5621082Z W0907 09:52:13.561000 34584 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". 
2025-09-07T09:52:13.5622408Z W0907 09:52:13.561000 34584 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html 2025-09-07T09:52:16.0903634Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:52:16.0904972Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:52:16.0905870Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:52:16.0906573Z v4 = masked_index(y_high, x_high) 2025-09-07T09:52:16.0907224Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:52:16.0907843Z return input[ 2025-09-07T09:52:16.0908001Z 2025-09-07T09:52:16.0908005Z 2025-09-07T09:52:19.2791691Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:52:19.2792990Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:52:19.2793921Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:52:19.2794624Z v4 = masked_index(y_high, x_high) 2025-09-07T09:52:19.2795248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:52:19.2796268Z return input[ 2025-09-07T09:52:19.2796436Z 2025-09-07T09:52:19.2796441Z 2025-09-07T09:52:21.5754387Z skipping cudagraphs due to disabling cudagraphs due to incompatible op 
aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:52:21.5755704Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:52:21.5756605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:52:21.5757324Z v4 = masked_index(y_high, x_high) 2025-09-07T09:52:21.5757962Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:52:21.5758593Z return input[ 2025-09-07T09:52:21.5758737Z 2025-09-07T09:52:21.5758742Z 2025-09-07T09:52:23.8838399Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:52:23.8839738Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:52:23.8840644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:52:23.8841349Z v4 = masked_index(y_high, x_high) 2025-09-07T09:52:23.8842415Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:52:23.8843270Z return input[ 2025-09-07T09:52:23.8843418Z 2025-09-07T09:52:23.8843423Z 2025-09-07T09:52:29.7311790Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:52:29.7313113Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:52:29.7314010Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:52:29.7314696Z v4 = masked_index(y_high, x_high) 2025-09-07T09:52:29.7315331Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:52:29.7315959Z return input[ 2025-09-07T09:52:29.7316117Z 2025-09-07T09:52:29.7316122Z 2025-09-07T09:52:33.3459856Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:52:33.3461163Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:52:33.3462071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:52:33.3462778Z v4 = masked_index(y_high, x_high) 2025-09-07T09:52:33.3463417Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T09:52:33.3464049Z return input[ 2025-09-07T09:52:33.3464194Z 2025-09-07T09:52:33.3464199Z 2025-09-07T09:52:37.9237808Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T09:52:37.9239148Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T09:52:37.9240052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T09:52:37.9240742Z v4 = masked_index(y_high, x_high) 2025-09-07T09:52:37.9241774Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 
2025-09-07T09:52:37.9242418Z return input[ 2025-09-07T09:52:37.9242568Z 2025-09-07T09:52:37.9242572Z 2025-09-07T09:52:48.0739655Z W0907 09:52:48.073000 34584 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:52:48.1024453Z W0907 09:52:48.101000 34584 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T09:52:48.1974236Z pass 2025-09-07T09:52:54.0089187Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:52:54.0091738Z import pynvml # type: ignore[import] 2025-09-07T09:52:56.7244868Z 2025-09-07T09:52:59.8320622Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:52:59.8321017Z loading model: 0it [00:03, ?it/s] 2025-09-07T09:52:59.8321342Z cuda train yolov3 2025-09-07T09:53:32.8993892Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T09:53:32.8994779Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T09:53:32.8995550Z pred = mod(*cloned_inputs) 2025-09-07T09:53:32.8996433Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T09:53:32.8996969Z return self.forward_once(x) 2025-09-07T09:53:32.8997478Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T09:53:32.8998038Z yolo_out.append(module(x, out)) 2025-09-07T09:53:32.8998550Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T09:53:32.8999142Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T09:53:32.8999713Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T09:53:32.9000300Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T09:53:32.9000566Z 2025-09-07T09:53:32.9000728Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T09:53:32.9001541Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T09:53:32.9002309Z pred = mod(*cloned_inputs) 2025-09-07T09:53:32.9002787Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T09:53:32.9003574Z return self.forward_once(x) 2025-09-07T09:53:32.9004094Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T09:53:32.9004657Z yolo_out.append(module(x, out)) 2025-09-07T09:53:32.9005166Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T09:53:32.9005738Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T09:53:32.9006315Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T09:53:32.9006900Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T09:53:32.9007150Z 2025-09-07T09:53:32.9007325Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T09:53:32.9008150Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T09:53:32.9008920Z pred = mod(*cloned_inputs) 2025-09-07T09:53:32.9009409Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T09:53:32.9009941Z return self.forward_once(x) 2025-09-07T09:53:32.9010455Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T09:53:32.9011251Z yolo_out.append(module(x, out)) 2025-09-07T09:53:32.9011769Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T09:53:32.9012345Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T09:53:32.9012927Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T09:53:32.9013499Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T09:53:32.9013763Z 2025-09-07T09:53:33.1897921Z cudagraph partition into 2 partitions 2025-09-07T09:54:16.0406958Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T09:54:16.0408190Z pred = mod(*cloned_inputs) 2025-09-07T09:54:16.0408699Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T09:54:16.0409241Z return self.forward_once(x) 2025-09-07T09:54:16.0409783Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 374, in forward_once 2025-09-07T09:54:16.0410320Z x = module(x) 2025-09-07T09:54:16.0410460Z 2025-09-07T09:54:16.0410465Z 2025-09-07T09:54:16.7344752Z W0907 09:54:16.733000 35973 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:54:55.0514685Z skipping cudagraphs due to disabling 
cudagraphs due to incompatible op aten.index_put_.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T09:54:55.0516573Z pred = mod(*cloned_inputs) 2025-09-07T09:54:55.0517077Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T09:54:55.0517603Z return self.forward_once(x) 2025-09-07T09:54:55.0518139Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 374, in forward_once 2025-09-07T09:54:55.0518677Z x = module(x) 2025-09-07T09:54:55.0518817Z 2025-09-07T09:54:55.0518822Z 2025-09-07T09:54:56.3789733Z pass 2025-09-07T09:55:01.3619322Z accuracy pass_rate=82.35% 2025-09-07T09:55:01.3627538Z calls_captured gmean=0.00x mean=779.588x 2025-09-07T09:55:01.3631993Z unique_graphs gmean=0.00x mean=5.941x 2025-09-07T09:55:01.3636292Z graph_breaks gmean=0.00x mean=8.294x 2025-09-07T09:55:01.3640653Z unique_graph_breaks gmean=0.00x mean=4.765x 2025-09-07T09:55:01.3645317Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T09:55:01.3649491Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T09:55:01.3653717Z cudagraph_skips gmean=0.00x mean=1.059x 2025-09-07T09:55:01.3655130Z compilation_latency mean=52.336 seconds 2025-09-07T09:55:02.3150940Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *dynamic-true* ]] 2025-09-07T09:55:02.3153639Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --training --amp --backend inductor --dynamic-shapes --dynamic-batch-only --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_dynamic_torchbench_amp_training_cuda_accuracy.csv 2025-09-07T09:55:02.9282389Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:55:02.9284069Z import pynvml # type: ignore[import] 2025-09-07T09:55:06.8876245Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:55:06.8877665Z import pynvml # type: ignore[import] 2025-09-07T09:55:09.6454347Z 2025-09-07T09:55:10.8121498Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:55:10.8121883Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:55:10.8122220Z cuda train soft_actor_critic 2025-09-07T09:55:19.9321581Z W0907 09:55:19.931000 37080 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:55:20.3167246Z pass 2025-09-07T09:55:23.5766948Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:55:23.5768376Z import pynvml # type: ignore[import] 2025-09-07T09:55:26.6353791Z 2025-09-07T09:55:28.2455973Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:55:28.2456353Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:55:28.2457422Z Traceback (most recent call last): 2025-09-07T09:55:28.2457984Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 508, in <module> 2025-09-07T09:55:28.2459523Z torchbench_main() 2025-09-07T09:55:28.2460046Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 504, in torchbench_main 2025-09-07T09:55:28.2460685Z main(TorchBenchmarkRunner(), original_dir) 2025-09-07T09:55:28.2461655Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 3636, in main 2025-09-07T09:55:28.2464789Z process_entry(0, runner, original_dir, args) 2025-09-07T09:55:28.2465380Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 3561, in process_entry 2025-09-07T09:55:28.2468748Z result = run(runner, args, original_dir) 2025-09-07T09:55:28.2469289Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 4251, in run 2025-09-07T09:55:28.2473564Z assert marked, f"nothing in example_inputs had a dim with {batch_size}" 2025-09-07T09:55:28.2474102Z AssertionError: nothing in example_inputs had a dim with 32 2025-09-07T09:55:29.1938164Z Run failed with return code: 1 2025-09-07T09:55:29.1938527Z Output: None 2025-09-07T09:55:29.1938760Z Error: None 2025-09-07T09:55:29.7981910Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
2025-09-07T09:55:29.7983425Z import pynvml # type: ignore[import] 2025-09-07T09:55:32.8033050Z 2025-09-07T09:55:34.2910344Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:55:34.2910721Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:55:34.2911320Z cuda train squeezenet1_1 2025-09-07T09:55:44.0416905Z pass 2025-09-07T09:55:47.2406410Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:55:47.2407823Z import pynvml # type: ignore[import] 2025-09-07T09:55:50.2330278Z 2025-09-07T09:55:52.0417736Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:55:52.0418059Z 2025-09-07T09:55:52.3067545Z Loading pipeline components...: 0% 0/6 [00:00 will be ignored 2025-09-07T10:00:24.7476224Z pass 2025-09-07T10:00:28.0903958Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:00:28.0905465Z import pynvml # type: ignore[import] 2025-09-07T10:00:30.9524418Z 2025-09-07T10:00:34.6825523Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:00:34.6826496Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:00:34.6826941Z cuda train timm_nfnet 2025-09-07T10:00:48.9804831Z pass 2025-09-07T10:00:52.3912836Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:00:52.3915589Z import pynvml # type: ignore[import] 2025-09-07T10:00:55.0954639Z 2025-09-07T10:00:58.9017314Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:00:58.9018046Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:00:58.9018747Z cuda train timm_regnet 2025-09-07T10:01:15.0407575Z W0907 10:01:15.039000 38428 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:01:21.9946646Z W0907 10:01:21.993000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:21.9954201Z W0907 10:01:21.995000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:21.9998393Z W0907 10:01:21.999000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0005840Z W0907 10:01:22.000000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0012636Z W0907 10:01:22.000000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0020051Z W0907 10:01:22.001000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0026886Z W0907 10:01:22.002000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0033911Z W0907 10:01:22.003000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0040735Z W0907 10:01:22.003000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0064121Z W0907 10:01:22.005000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. 
Consider running in higher precision. 2025-09-07T10:01:22.0065564Z W0907 10:01:22.006000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0070884Z W0907 10:01:22.006000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0077648Z W0907 10:01:22.007000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0084742Z W0907 10:01:22.008000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0091894Z W0907 10:01:22.008000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0099029Z W0907 10:01:22.009000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0106061Z W0907 10:01:22.010000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0113091Z W0907 10:01:22.011000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0120077Z W0907 10:01:22.011000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0127429Z W0907 10:01:22.012000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0138932Z W0907 10:01:22.013000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0145733Z W0907 10:01:22.014000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:01:22.0153166Z W0907 10:01:22.015000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0160080Z W0907 10:01:22.015000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0167334Z W0907 10:01:22.016000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0174583Z W0907 10:01:22.017000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0196607Z W0907 10:01:22.019000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0670166Z W0907 10:01:22.066000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0677207Z W0907 10:01:22.067000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0685111Z W0907 10:01:22.068000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0692462Z W0907 10:01:22.068000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0699882Z W0907 10:01:22.069000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0707531Z W0907 10:01:22.070000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0715000Z W0907 10:01:22.071000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:01:22.0722440Z W0907 10:01:22.071000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0729886Z W0907 10:01:22.072000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0739539Z W0907 10:01:22.073000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0750929Z W0907 10:01:22.074000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0758141Z W0907 10:01:22.075000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0766195Z W0907 10:01:22.076000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0773465Z W0907 10:01:22.077000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0780883Z W0907 10:01:22.077000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0788321Z W0907 10:01:22.078000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0795728Z W0907 10:01:22.079000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0803115Z W0907 10:01:22.080000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0812630Z W0907 10:01:22.080000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:01:22.0821274Z W0907 10:01:22.081000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0830239Z W0907 10:01:22.082000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0844546Z W0907 10:01:22.084000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0853394Z W0907 10:01:22.084000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0862663Z W0907 10:01:22.085000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0871590Z W0907 10:01:22.086000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0880019Z W0907 10:01:22.087000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0889571Z W0907 10:01:22.088000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.0917370Z W0907 10:01:22.091000 38428 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:01:22.2035747Z pass 2025-09-07T10:01:25.5613701Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:01:25.5615321Z import pynvml # type: ignore[import] 2025-09-07T10:01:28.3673681Z 2025-09-07T10:01:31.1791203Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:01:31.1791951Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:01:31.1792606Z cuda train timm_resnest 2025-09-07T10:01:43.2086217Z pass 2025-09-07T10:01:46.5576516Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:01:46.5578066Z import pynvml # type: ignore[import] 2025-09-07T10:01:49.4039059Z 2025-09-07T10:01:51.9146717Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:01:51.9147197Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:01:51.9147591Z cuda train timm_vision_transformer 2025-09-07T10:02:02.8313410Z pass 2025-09-07T10:02:06.1466198Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:02:06.1467676Z import pynvml # type: ignore[import] 2025-09-07T10:02:08.8752816Z 2025-09-07T10:02:22.8668029Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:02:22.8668544Z loading model: 0it [00:13, ?it/s] 2025-09-07T10:02:22.8668979Z cuda train timm_vision_transformer_large 2025-09-07T10:02:22.8907689Z pass_due_to_skip 2025-09-07T10:02:24.7446976Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:02:24.7448491Z import pynvml # type: ignore[import] 2025-09-07T10:02:27.6640799Z 2025-09-07T10:02:30.4678105Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:02:30.4678597Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:02:30.4679014Z cuda train timm_vovnet 2025-09-07T10:02:43.7744520Z pass 2025-09-07T10:02:47.3320908Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:02:47.3324112Z import pynvml # type: ignore[import] 2025-09-07T10:02:50.2436422Z 2025-09-07T10:02:52.6288487Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:02:52.6289262Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:02:52.6289907Z cuda train torch_multimodal_clip 2025-09-07T10:03:04.8982941Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T10:03:04.8985557Z pred = mod(*cloned_inputs) 2025-09-07T10:03:04.8986896Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchmultimodal/models/clip/model.py", line 72, in forward 2025-09-07T10:03:04.8988316Z embeddings_b = self.encoder_b(features_b) 2025-09-07T10:03:04.8989785Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchmultimodal/models/clip/text_encoder.py", line 132, in forward 2025-09-07T10:03:04.8991460Z hidden_state[torch.arange(hidden_state.shape[0]), text.argmax(dim=-1)] 2025-09-07T10:03:04.8992151Z 2025-09-07T10:03:04.8992160Z 2025-09-07T10:03:05.9027692Z W0907 10:03:05.901000 39160 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:03:12.2621655Z pass 2025-09-07T10:03:15.6438181Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:03:15.6439806Z import pynvml # type: ignore[import] 2025-09-07T10:03:18.4471544Z 2025-09-07T10:03:19.3875903Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:03:19.3876555Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:03:19.3877682Z cuda train tts_angular 2025-09-07T10:03:27.3571542Z W0907 10:03:27.356000 39320 site-packages/torch/_logging/_internal.py:1199] [10/0] Profiler function will be ignored 2025-09-07T10:03:27.9500376Z pass 2025-09-07T10:03:30.7106460Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:03:30.7108007Z import pynvml # type: ignore[import] 2025-09-07T10:03:33.4349636Z 2025-09-07T10:03:36.5145422Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:03:36.5145838Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:03:36.5146283Z cuda train vgg16 2025-09-07T10:03:47.7230636Z pass 2025-09-07T10:03:50.8569392Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:03:50.8570910Z import pynvml # type: ignore[import] 2025-09-07T10:03:53.7299974Z 2025-09-07T10:04:01.5458682Z loading model: 0it [00:00, ?it/s]skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:01.5460659Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:01.5461647Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:01.5462429Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:01.5463156Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:01.5463828Z return input[ 2025-09-07T10:04:01.5464041Z 2025-09-07T10:04:01.5464047Z 2025-09-07T10:04:06.4809366Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:06.4810774Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:06.4811784Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:06.4812574Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:06.4813296Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:06.4813969Z return input[ 2025-09-07T10:04:06.4814198Z 2025-09-07T10:04:06.4814204Z 2025-09-07T10:04:07.8895271Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:07.8896683Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:07.8897761Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:07.8898582Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:07.8899315Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:07.8899989Z return input[ 2025-09-07T10:04:07.8900206Z 2025-09-07T10:04:07.8900211Z 2025-09-07T10:04:09.0876193Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:09.0877625Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:09.0878618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:09.0879406Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:09.0880092Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:09.0880825Z return input[ 2025-09-07T10:04:09.0881055Z 2025-09-07T10:04:09.0881060Z 2025-09-07T10:04:10.1183538Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:10.1184965Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:10.1185944Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:10.1186736Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:10.1187423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:10.1188153Z return input[ 2025-09-07T10:04:10.1188367Z 2025-09-07T10:04:10.1188759Z 2025-09-07T10:04:11.1403622Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:11.1405024Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:11.1406033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:11.1406831Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:11.1407551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:11.1408226Z return input[ 2025-09-07T10:04:11.1408439Z 2025-09-07T10:04:11.1408443Z 2025-09-07T10:04:12.3286519Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:12.3287938Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:12.3288927Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:12.3289713Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:12.3290412Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 
2025-09-07T10:04:12.3291124Z return input[ 2025-09-07T10:04:12.3291336Z 2025-09-07T10:04:12.3291340Z 2025-09-07T10:04:13.0120807Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:13.0122156Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:13.0123401Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:13.0124192Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:13.0124912Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:13.0125628Z return input[ 2025-09-07T10:04:13.0126126Z 2025-09-07T10:04:13.0126131Z 2025-09-07T10:04:13.5377997Z 2025-09-07T10:04:13.5378373Z loading model: 0it [00:19, ?it/s] 2025-09-07T10:04:13.5379014Z cuda train vision_maskrcnn 2025-09-07T10:04:13.6899838Z W0907 10:04:13.689000 39640 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] torch._dynamo hit config.recompile_limit (8) 2025-09-07T10:04:13.6901155Z W0907 10:04:13.689000 39640 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] function: '_roi_align' (/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py:114) 2025-09-07T10:04:13.6902608Z W0907 10:04:13.689000 39640 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] last reason: 0/7: tensor 'rois' dtype mismatch. expected Float, actual Double 2025-09-07T10:04:13.6903876Z W0907 10:04:13.689000 39640 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". 
2025-09-07T10:04:13.6905292Z W0907 10:04:13.689000 39640 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html 2025-09-07T10:04:21.5425765Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 114, in forward 2025-09-07T10:04:21.5427111Z features = self.backbone(images.tensors) 2025-09-07T10:04:21.5428377Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/backbone_utils.py", line 58, in forward 2025-09-07T10:04:21.5429134Z x = self.fpn(x) 2025-09-07T10:04:21.5429844Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/feature_pyramid_network.py", line 194, in forward 2025-09-07T10:04:21.5430778Z inner_top_down = F.interpolate(last_inner, size=feat_shape, mode="nearest") 2025-09-07T10:04:21.5431164Z 2025-09-07T10:04:21.5431178Z 2025-09-07T10:04:24.6471812Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break from `Tensor.item()`, consider setting: 2025-09-07T10:04:24.6473040Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] torch._dynamo.config.capture_scalar_outputs = True 2025-09-07T10:04:24.6473981Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] or: 2025-09-07T10:04:24.6474923Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] env TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 2025-09-07T10:04:24.6476074Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] to include these operations in the captured graph. 
2025-09-07T10:04:24.6477026Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T10:04:24.6477920Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break: from user code at: 2025-09-07T10:04:24.6479476Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in torch_dynamo_resume_in_forward_at_117 2025-09-07T10:04:24.6481332Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T10:04:24.6483243Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 763, in forward 2025-09-07T10:04:24.6484806Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T10:04:24.6486683Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 310, in forward 2025-09-07T10:04:24.6488089Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] self.scales, self.map_levels = _setup_scales( 2025-09-07T10:04:24.6489512Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in _setup_scales 2025-09-07T10:04:24.6491029Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat in 
features] 2025-09-07T10:04:24.6492499Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in 2025-09-07T10:04:24.6493991Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T10:04:24.6495489Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T10:04:24.6496938Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T10:04:24.6498198Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T10:04:24.6498953Z W0907 10:04:24.646000 39640 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T10:04:25.7847934Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T10:04:25.7849525Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T10:04:25.7850551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T10:04:25.7851343Z return fn(*args[2:], **kwargs) 2025-09-07T10:04:25.7852013Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:25.7852952Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:25.7853940Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:25.7854724Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:25.7855447Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:25.7856116Z return input[ 2025-09-07T10:04:25.7856327Z 2025-09-07T10:04:25.7856332Z 2025-09-07T10:04:26.4821825Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T10:04:26.4823400Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T10:04:26.4824462Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T10:04:26.4825206Z return fn(*args[2:], **kwargs) 2025-09-07T10:04:26.4825910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:26.4827143Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:26.4828116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:26.4828957Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:26.4829642Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:26.4830363Z return input[ 2025-09-07T10:04:26.4830577Z 2025-09-07T10:04:26.4830592Z 2025-09-07T10:04:27.0836271Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T10:04:27.0837845Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T10:04:27.0838890Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T10:04:27.0839636Z return fn(*args[2:], **kwargs) 2025-09-07T10:04:27.0840340Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:27.0841273Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:27.0842248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:27.0843664Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:27.0844344Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:27.0845056Z return input[ 2025-09-07T10:04:27.0845272Z 2025-09-07T10:04:27.0845278Z 2025-09-07T10:04:27.6584207Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T10:04:27.6585770Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T10:04:27.6586794Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T10:04:27.6587585Z return fn(*args[2:], **kwargs) 2025-09-07T10:04:27.6588259Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 
2025-09-07T10:04:27.6589190Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:27.6590170Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:27.6590959Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:27.6591684Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:27.6592361Z return input[ 2025-09-07T10:04:27.6592586Z 2025-09-07T10:04:27.6592590Z 2025-09-07T10:04:31.0017252Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T10:04:31.0019034Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T10:04:31.0020065Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T10:04:31.0020857Z return fn(*args[2:], **kwargs) 2025-09-07T10:04:31.0021902Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:31.0022845Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:31.0023826Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:31.0024616Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:31.0025330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:31.0026012Z return input[ 2025-09-07T10:04:31.0026226Z 2025-09-07T10:04:31.0026231Z 2025-09-07T10:04:31.6573801Z skipping cudagraphs due to 
disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T10:04:31.6575392Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T10:04:31.6576414Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T10:04:31.6577203Z return fn(*args[2:], **kwargs) 2025-09-07T10:04:31.6577980Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:31.6578870Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:31.6580198Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:31.6580988Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:31.6581702Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:31.6582414Z return input[ 2025-09-07T10:04:31.6582587Z 2025-09-07T10:04:31.6582592Z 2025-09-07T10:04:32.8093878Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T10:04:32.8095401Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T10:04:32.8096426Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T10:04:32.8097220Z return fn(*args[2:], **kwargs) 2025-09-07T10:04:32.8097999Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:32.8098928Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:32.8099875Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:32.8100658Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:32.8101379Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:32.8102102Z return input[ 2025-09-07T10:04:32.8102273Z 2025-09-07T10:04:32.8102278Z 2025-09-07T10:04:33.4441069Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T10:04:33.4442677Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T10:04:33.4443874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T10:04:33.4444659Z return fn(*args[2:], **kwargs) 2025-09-07T10:04:33.4448853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:33.4449950Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:33.4450940Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:33.4451687Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:33.4452412Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:33.4453144Z return input[ 
2025-09-07T10:04:33.4453332Z 2025-09-07T10:04:33.4453337Z 2025-09-07T10:04:33.6342629Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 823, in torch_dynamo_resume_in_forward_at_806 2025-09-07T10:04:33.6344090Z masks_probs = maskrcnn_inference(mask_logits, labels) 2025-09-07T10:04:33.6344982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 79, in maskrcnn_inference 2025-09-07T10:04:33.6345808Z mask_prob = mask_prob[index, labels][:, None] 2025-09-07T10:04:33.6346124Z 2025-09-07T10:04:33.6346129Z 2025-09-07T10:04:34.5528431Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 418, in torch_dynamo_resume_in_paste_mask_in_image_at_410 2025-09-07T10:04:34.5530378Z mask = F.interpolate(mask, size=(h, w), mode="bilinear", align_corners=False) 2025-09-07T10:04:34.5530806Z 2025-09-07T10:04:34.5530811Z 2025-09-07T10:04:35.0704208Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 418, in torch_dynamo_resume_in_paste_mask_in_image_at_410 2025-09-07T10:04:35.0705773Z mask = F.interpolate(mask, size=(h, w), mode="bilinear", align_corners=False) 2025-09-07T10:04:35.0706207Z 2025-09-07T10:04:35.0706212Z 2025-09-07T10:04:36.4437274Z W0907 10:04:36.442000 39640 site-packages/torch/_logging/_internal.py:1199] [51/0] Profiler function will be ignored 2025-09-07T10:04:38.9618490Z W0907 10:04:38.961000 39640 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] torch._dynamo hit config.recompile_limit (8) 2025-09-07T10:04:38.9620022Z W0907 
10:04:38.961000 39640 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] function: 'torch_dynamo_resume_in_roi_align_at_255' (/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py:255) 2025-09-07T10:04:38.9621621Z W0907 10:04:38.961000 39640 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] last reason: 30/7: tensor 'rois' requires_grad mismatch. expected requires_grad=1 2025-09-07T10:04:38.9622884Z W0907 10:04:38.961000 39640 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". 2025-09-07T10:04:38.9624272Z W0907 10:04:38.961000 39640 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html 2025-09-07T10:04:39.9778900Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:39.9780314Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:39.9781294Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:39.9782406Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:39.9783133Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:39.9783810Z return input[ 2025-09-07T10:04:39.9784026Z 2025-09-07T10:04:39.9784031Z 2025-09-07T10:04:40.5910979Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:40.5912382Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, 
C, PH, PW, IY, IX] 2025-09-07T10:04:40.5913365Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:40.5914111Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:40.5914805Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:40.5915525Z return input[ 2025-09-07T10:04:40.5915743Z 2025-09-07T10:04:40.5915748Z 2025-09-07T10:04:41.2039570Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:41.2040980Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:41.2041926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:41.2043255Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:41.2043979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:41.2044684Z return input[ 2025-09-07T10:04:41.2044856Z 2025-09-07T10:04:41.2044861Z 2025-09-07T10:04:41.8111899Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:41.8113262Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:41.8114243Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:41.8115030Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:41.8115770Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:41.8116481Z return input[ 2025-09-07T10:04:41.8116656Z 2025-09-07T10:04:41.8116661Z 2025-09-07T10:04:43.4867613Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:43.4868967Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:43.4869959Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:43.4870730Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:43.4871456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:43.4872187Z return input[ 2025-09-07T10:04:43.4872361Z 2025-09-07T10:04:43.4872366Z 2025-09-07T10:04:44.0989880Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:44.0991279Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:44.0992615Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:44.0993412Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:44.0994092Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:44.0994815Z return input[ 2025-09-07T10:04:44.0995031Z 2025-09-07T10:04:44.7085191Z 2025-09-07T10:04:44.7086446Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default 
Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:04:44.7087858Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:04:44.7088839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:04:44.7089624Z v4 = masked_index(y_high, x_high) 2025-09-07T10:04:44.7090308Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:04:44.7091028Z return input[ 2025-09-07T10:04:44.7091246Z 2025-09-07T10:04:44.7091251Z 2025-09-07T10:04:47.0393581Z W0907 10:04:47.038000 39640 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:04:47.0772090Z W0907 10:04:47.076000 39640 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:04:47.1955005Z pass 2025-09-07T10:04:51.0458588Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:04:51.0461218Z import pynvml # type: ignore[import] 2025-09-07T10:04:53.7214580Z 2025-09-07T10:04:56.7776370Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:04:56.7776817Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:04:56.7777192Z cuda train yolov3 2025-09-07T10:05:14.1858299Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T10:05:14.1860613Z pred = mod(*cloned_inputs) 2025-09-07T10:05:14.1861567Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T10:05:14.1862588Z return self.forward_once(x) 2025-09-07T10:05:14.1863543Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 374, in forward_once 2025-09-07T10:05:14.1864575Z x = module(x) 2025-09-07T10:05:14.1864898Z 2025-09-07T10:05:14.1864907Z 2025-09-07T10:05:16.4795548Z W0907 10:05:16.478000 39838 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:05:26.0232835Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T10:05:26.0234944Z pred = mod(*cloned_inputs) 2025-09-07T10:05:26.0235843Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T10:05:26.0236839Z return self.forward_once(x) 2025-09-07T10:05:26.0237761Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 374, in forward_once 2025-09-07T10:05:26.0238696Z x = module(x) 2025-09-07T10:05:26.0239019Z 2025-09-07T10:05:26.0239028Z 2025-09-07T10:05:27.3709381Z pass 2025-09-07T10:05:30.2910306Z accuracy pass_rate=88.24% 2025-09-07T10:05:30.2918568Z calls_captured gmean=0.00x 
mean=781.059x 2025-09-07T10:05:30.2922635Z unique_graphs gmean=0.00x mean=5.941x 2025-09-07T10:05:30.2927280Z graph_breaks gmean=0.00x mean=8.294x 2025-09-07T10:05:30.2931704Z unique_graph_breaks gmean=0.00x mean=4.765x 2025-09-07T10:05:30.2935816Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T10:05:30.2940318Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T10:05:30.2944525Z cudagraph_skips gmean=0.00x mean=1.059x 2025-09-07T10:05:30.2945656Z compilation_latency mean=23.261 seconds 2025-09-07T10:05:31.2185678Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *cppwrapper-true* ]] 2025-09-07T10:05:31.2187250Z + TORCHINDUCTOR_CPP_WRAPPER=1 2025-09-07T10:05:31.2188849Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --training --amp --backend inductor --disable-cudagraphs --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_cpp_wrapper_torchbench_amp_training_cuda_accuracy.csv 2025-09-07T10:05:31.8321613Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:05:31.8323716Z import pynvml # type: ignore[import] 2025-09-07T10:05:35.7525808Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:05:35.7528094Z import pynvml # type: ignore[import] 2025-09-07T10:05:38.4648828Z 2025-09-07T10:05:39.4952105Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:05:39.4952769Z loading model: 0it [00:01, ?it/s] 2025-09-07T10:05:39.4953368Z cuda train soft_actor_critic 2025-09-07T10:05:58.1293327Z W0907 10:05:58.128000 40041 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:06:00.8939497Z pass 2025-09-07T10:06:03.8566158Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:06:03.8567703Z import pynvml # type: ignore[import] 2025-09-07T10:06:06.8534535Z 2025-09-07T10:06:08.4841158Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:06:08.4841849Z loading model: 0it [00:01, ?it/s] 2025-09-07T10:06:08.4842534Z cuda train speech_transformer 2025-09-07T10:07:47.9053076Z pass 2025-09-07T10:07:52.3605603Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:07:52.3607768Z import pynvml # type: ignore[import] 2025-09-07T10:07:55.1535450Z 2025-09-07T10:07:56.6138098Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:07:56.6138551Z loading model: 0it [00:01, ?it/s] 2025-09-07T10:07:56.6138966Z cuda train squeezenet1_1 2025-09-07T10:08:27.0421981Z pass 2025-09-07T10:08:30.4113429Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. 
If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:08:30.4116719Z import pynvml # type: ignore[import] 2025-09-07T10:08:33.1684753Z 2025-09-07T10:08:35.0882438Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:08:35.0882753Z 2025-09-07T10:08:35.2650226Z Loading pipeline components...: 0% 0/6 [00:00 will be ignored 2025-09-07T10:21:18.0131838Z pass 2025-09-07T10:21:24.2855230Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:21:24.2857651Z import pynvml # type: ignore[import] 2025-09-07T10:21:26.9970646Z 2025-09-07T10:21:30.7311865Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:21:30.7312356Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:21:30.7312768Z cuda train timm_nfnet 2025-09-07T10:22:56.0898845Z pass 2025-09-07T10:23:00.6959679Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:23:00.6961181Z import pynvml # type: ignore[import] 2025-09-07T10:23:03.5705155Z 2025-09-07T10:23:07.2940057Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:23:07.2940526Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:23:07.2940902Z cuda train timm_regnet 2025-09-07T10:24:29.0197706Z W0907 10:24:29.018000 43801 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:25:34.1532749Z W0907 10:25:34.152000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:25:34.1540158Z W0907 10:25:34.153000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1547542Z W0907 10:25:34.154000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1554592Z W0907 10:25:34.155000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1561405Z W0907 10:25:34.155000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1569153Z W0907 10:25:34.156000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1576193Z W0907 10:25:34.157000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1583345Z W0907 10:25:34.158000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1590629Z W0907 10:25:34.158000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1597562Z W0907 10:25:34.159000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1604801Z W0907 10:25:34.160000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1611852Z W0907 10:25:34.160000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1627565Z W0907 10:25:34.162000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:25:34.1634403Z W0907 10:25:34.163000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1641745Z W0907 10:25:34.163000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1649291Z W0907 10:25:34.164000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1656074Z W0907 10:25:34.165000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1663219Z W0907 10:25:34.166000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1670249Z W0907 10:25:34.166000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1677077Z W0907 10:25:34.167000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1684852Z W0907 10:25:34.168000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1691807Z W0907 10:25:34.168000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1699078Z W0907 10:25:34.169000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1710290Z W0907 10:25:34.170000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1717134Z W0907 10:25:34.171000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:25:34.1725114Z W0907 10:25:34.172000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1732078Z W0907 10:25:34.172000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1739059Z W0907 10:25:34.173000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.1746558Z W0907 10:25:34.174000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2233130Z W0907 10:25:34.222000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2240392Z W0907 10:25:34.223000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2248273Z W0907 10:25:34.224000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2255547Z W0907 10:25:34.225000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2263127Z W0907 10:25:34.226000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2270795Z W0907 10:25:34.226000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2278233Z W0907 10:25:34.227000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2285817Z W0907 10:25:34.228000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:25:34.2293690Z W0907 10:25:34.229000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2301145Z W0907 10:25:34.229000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2308467Z W0907 10:25:34.230000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2315913Z W0907 10:25:34.231000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2332169Z W0907 10:25:34.232000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2339333Z W0907 10:25:34.233000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2347186Z W0907 10:25:34.234000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2354520Z W0907 10:25:34.235000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2361871Z W0907 10:25:34.235000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2369969Z W0907 10:25:34.236000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2377309Z W0907 10:25:34.237000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2384846Z W0907 10:25:34.238000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:25:34.2392539Z W0907 10:25:34.238000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2399807Z W0907 10:25:34.239000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2407653Z W0907 10:25:34.240000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2419413Z W0907 10:25:34.241000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2426688Z W0907 10:25:34.242000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2434656Z W0907 10:25:34.243000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2442021Z W0907 10:25:34.243000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2449516Z W0907 10:25:34.244000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.2457081Z W0907 10:25:34.245000 43801 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:25:34.3598742Z pass 2025-09-07T10:25:40.2150380Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:25:40.2151803Z import pynvml # type: ignore[import] 2025-09-07T10:25:42.9717702Z 2025-09-07T10:25:45.7550536Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:25:45.7551172Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:25:45.7551731Z cuda train timm_resnest 2025-09-07T10:26:33.0403467Z pass 2025-09-07T10:26:36.5134385Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:26:36.5137016Z import pynvml # type: ignore[import] 2025-09-07T10:26:39.2069726Z 2025-09-07T10:26:41.6083810Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:26:41.6084427Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:26:41.6084972Z cuda train timm_vision_transformer 2025-09-07T10:27:28.1914000Z pass 2025-09-07T10:27:31.7749736Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:27:31.7751358Z import pynvml # type: ignore[import] 2025-09-07T10:27:34.5559950Z 2025-09-07T10:27:48.8448478Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:27:48.8448866Z loading model: 0it [00:14, ?it/s] 2025-09-07T10:27:48.8449207Z cuda train timm_vision_transformer_large 2025-09-07T10:27:48.8681539Z pass_due_to_skip 2025-09-07T10:27:50.6541886Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:27:50.6543413Z import pynvml # type: ignore[import] 2025-09-07T10:27:53.2772842Z 2025-09-07T10:27:55.9927521Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:27:55.9927891Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:27:55.9928208Z cuda train timm_vovnet 2025-09-07T10:28:45.4985414Z pass 2025-09-07T10:28:49.4529406Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:28:49.4530825Z import pynvml # type: ignore[import] 2025-09-07T10:28:52.3287622Z 2025-09-07T10:28:54.6583971Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:28:54.6584369Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:28:54.6584691Z cuda train torch_multimodal_clip 2025-09-07T10:30:50.1599478Z W0907 10:30:50.159000 45125 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:32:11.1276654Z pass 2025-09-07T10:32:18.3853907Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:32:18.3855999Z import pynvml # type: ignore[import] 2025-09-07T10:32:21.1794894Z 2025-09-07T10:32:22.1069424Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:32:22.1069828Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:32:32.8372340Z cuda train tts_angular 2025-09-07T10:32:32.8373892Z W0907 10:32:32.836000 45493 site-packages/torch/_logging/_internal.py:1199] [10/0] Profiler function will be ignored 2025-09-07T10:32:37.2669621Z pass 2025-09-07T10:32:40.1522095Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:32:40.1524536Z import pynvml # type: ignore[import] 2025-09-07T10:32:42.9308675Z 2025-09-07T10:32:46.0571004Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:32:46.0571633Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:32:46.0572163Z cuda train vgg16 2025-09-07T10:33:09.7348027Z pass 2025-09-07T10:33:12.7558279Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:33:12.7560672Z import pynvml # type: ignore[import] 2025-09-07T10:33:15.4420524Z 2025-09-07T10:33:34.4937403Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:33:34.4938027Z loading model: 0it [00:19, ?it/s] 2025-09-07T10:33:34.4939141Z cuda train vision_maskrcnn 2025-09-07T10:33:34.4939933Z Traceback (most recent call last): 2025-09-07T10:33:34.4940900Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 1997, in validate_model 2025-09-07T10:33:34.4941950Z self.model_iter_fn(model, example_inputs) 2025-09-07T10:33:34.4943075Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in forward_and_backward_pass 2025-09-07T10:33:34.4944191Z pred = mod(*cloned_inputs) 2025-09-07T10:33:34.4945344Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T10:33:34.4946561Z return self._call_impl(*args, **kwargs) 2025-09-07T10:33:34.4947680Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T10:33:34.4948811Z return forward_call(*args, **kwargs) 2025-09-07T10:33:34.4950118Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in forward 2025-09-07T10:33:34.4951812Z detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T10:33:34.4953423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T10:33:34.4954652Z return self._call_impl(*args, **kwargs) 2025-09-07T10:33:34.4955753Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T10:33:34.4957334Z return forward_call(*args, **kwargs) 2025-09-07T10:33:34.4958618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 
763, in forward 2025-09-07T10:33:34.4960019Z box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T10:33:34.4961321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T10:33:34.4962492Z return self._call_impl(*args, **kwargs) 2025-09-07T10:33:34.4963888Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T10:33:34.4965025Z return forward_call(*args, **kwargs) 2025-09-07T10:33:34.4966071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 314, in forward 2025-09-07T10:33:34.4978520Z return _multiscale_roi_align( 2025-09-07T10:33:34.4979925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 204, in _multiscale_roi_align 2025-09-07T10:33:34.4981172Z result_idx_in_level = roi_align( 2025-09-07T10:33:34.4982238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in roi_align 2025-09-07T10:33:34.4983777Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T10:33:34.4985285Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 27, in compile_hook 2025-09-07T10:33:34.4986398Z return compiled_fn(*args, **kwargs) 2025-09-07T10:33:34.4987515Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 832, in compile_wrapper 2025-09-07T10:33:34.4988663Z return fn(*args, **kwargs) 2025-09-07T10:33:34.4989778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1875, in __call__ 2025-09-07T10:33:34.4990928Z result = self._torchdynamo_orig_backend( 2025-09-07T10:33:34.4992052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1624, in __call__ 
2025-09-07T10:33:34.4993194Z result = self._inner_convert( 2025-09-07T10:33:34.4994327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 688, in __call__ 2025-09-07T10:33:34.4995754Z result = _compile( 2025-09-07T10:33:34.4996754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1494, in _compile 2025-09-07T10:33:34.4997959Z raise InternalTorchDynamoError( 2025-09-07T10:33:34.4999131Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1433, in _compile 2025-09-07T10:33:34.5000491Z guarded_code, tracer_output = compile_inner(code, one_graph, hooks) 2025-09-07T10:33:34.5001848Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_utils_internal.py", line 92, in wrapper_function 2025-09-07T10:33:34.5003179Z return function(*args, **kwargs) 2025-09-07T10:33:34.5004276Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1117, in compile_inner 2025-09-07T10:33:34.5005475Z return _compile_inner(code, one_graph, hooks) 2025-09-07T10:33:34.5006803Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1151, in _compile_inner 2025-09-07T10:33:34.5008011Z dynamo_output = compile_frame( 2025-09-07T10:33:34.5009159Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1032, in compile_frame 2025-09-07T10:33:34.5010484Z bytecode, tracer_output = transform_code_object(code, transform) 2025-09-07T10:33:34.5012079Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1592, in transform_code_object 2025-09-07T10:33:34.5013941Z tracer_output = transformations(instructions, code_options) 2025-09-07T10:33:34.5015245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1004, in transform 
2025-09-07T10:33:34.5016459Z tracer_output = trace_frame( 2025-09-07T10:33:34.5017624Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 333, in _fn 2025-09-07T10:33:34.5018760Z torch.cuda.set_rng_state(cuda_rng_state) 2025-09-07T10:33:34.5019855Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/random.py", line 79, in set_rng_state 2025-09-07T10:33:34.5020896Z _lazy_call(cb) 2025-09-07T10:33:34.5021767Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py", line 341, in _lazy_call 2025-09-07T10:33:34.5022802Z callable() 2025-09-07T10:33:34.5023661Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/random.py", line 77, in cb 2025-09-07T10:33:34.5024720Z default_generator.set_state(new_state) 2025-09-07T10:33:34.5025806Z torch._dynamo.exc.InternalTorchDynamoError: AcceleratorError: CUDA error: device-side assert triggered 2025-09-07T10:33:34.5027554Z Search for `cudaErrorAssert' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information. 2025-09-07T10:33:34.5029419Z CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. 2025-09-07T10:33:34.5030718Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1 2025-09-07T10:33:34.5031621Z Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions. 
2025-09-07T10:33:34.5032241Z 2025-09-07T10:33:34.5032250Z 2025-09-07T10:33:34.5032258Z 2025-09-07T10:33:34.5032660Z The above exception was the direct cause of the following exception: 2025-09-07T10:33:34.5033303Z 2025-09-07T10:33:34.5033499Z Traceback (most recent call last): 2025-09-07T10:33:34.5034413Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 4172, in run 2025-09-07T10:33:34.5035362Z ) = runner.load_model( 2025-09-07T10:33:34.5036288Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 401, in load_model 2025-09-07T10:33:34.5037348Z self.validate_model(model, example_inputs) 2025-09-07T10:33:34.5038380Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 1999, in validate_model 2025-09-07T10:33:34.5039752Z raise RuntimeError("Eager run failed") from e 2025-09-07T10:33:34.5040369Z RuntimeError: Eager run failed 2025-09-07T10:33:34.5040708Z 2025-09-07T10:33:34.5040868Z eager_fail_to_run 2025-09-07T10:33:36.5273460Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [32,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5276620Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [33,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5279670Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [34,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5282736Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [35,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 
2025-09-07T10:33:36.5286088Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [36,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5289097Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [37,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5292091Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [38,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5295651Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [39,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5298868Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [40,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5301935Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [41,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5304939Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [42,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5307888Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [43,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5310962Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [44,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 
2025-09-07T10:33:36.5313991Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [45,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5317049Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [46,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5320123Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [47,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5323243Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [48,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5326669Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [49,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5329684Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [50,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5332720Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [51,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5335859Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [52,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5338922Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [53,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 
2025-09-07T10:33:36.5341871Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [54,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5344909Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [55,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5347888Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [56,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5351444Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [57,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5354487Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [58,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5357527Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [59,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5360525Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [60,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5363699Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [61,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5366783Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [62,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 
2025-09-07T10:33:36.5369860Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25245,0,0], thread: [63,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5372891Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [32,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5375778Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [33,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5378975Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [34,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5382000Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [35,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5385370Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [36,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5388453Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [37,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5391382Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [38,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5394463Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [39,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 
2025-09-07T10:33:36.5397412Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [40,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5400493Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [41,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5403790Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [42,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5406772Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [43,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5410073Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [44,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5413118Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [45,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5416108Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [46,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5419287Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [47,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5422343Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [48,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 
2025-09-07T10:33:36.5425262Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [49,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5428359Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [50,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5431361Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [51,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5434385Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [52,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5437496Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [53,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5440777Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [54,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5443990Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [55,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5446849Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [56,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5449674Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [57,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 
2025-09-07T10:33:36.5452648Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [58,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5455791Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [59,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5459046Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [60,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5462157Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [61,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5465656Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [62,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:36.5468817Z /tmp/torchinductor_jenkins/6g/c6guqtklehxpet7g6rjyuqopsdf63ebmf5rdj5zjiecfcchcjttk.py:56: unknown: block: [25271,0,0], thread: [63,0,0] Assertion `index out of bounds: 0 <= tmp10 < 1` failed. 2025-09-07T10:33:37.5683668Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:33:37.5685165Z import pynvml # type: ignore[import] 2025-09-07T10:33:40.3759238Z 2025-09-07T10:33:43.6221512Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:33:43.6221915Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:33:43.6222274Z cuda train yolov3 2025-09-07T10:35:58.8587888Z W0907 10:35:58.857000 46315 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:36:57.1465694Z pass 2025-09-07T10:37:04.0563655Z accuracy pass_rate=83.33% 2025-09-07T10:37:04.0570762Z calls_captured gmean=0.00x mean=647.778x 2025-09-07T10:37:04.0575248Z unique_graphs gmean=0.00x mean=2.778x 2025-09-07T10:37:04.0579805Z graph_breaks gmean=0.00x mean=6.056x 2025-09-07T10:37:04.0584967Z unique_graph_breaks gmean=0.00x mean=4.389x 2025-09-07T10:37:04.0589392Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T10:37:04.0593848Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T10:37:04.0598013Z cudagraph_skips gmean=0.00x mean=0.000x 2025-09-07T10:37:04.0599409Z compilation_latency mean=91.528 seconds 2025-09-07T10:37:04.9661898Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *freezing_cudagraphs-true* ]] 2025-09-07T10:37:04.9663421Z + [[ training == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T10:37:04.9665293Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *freeze_autotune_cudagraphs-true* ]] 2025-09-07T10:37:04.9666773Z + [[ training == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T10:37:04.9668155Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == 
*aotinductor-true* ]] 2025-09-07T10:37:04.9669551Z + [[ training == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T10:37:04.9670932Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *maxautotune-true* ]] 2025-09-07T10:37:04.9672334Z + TORCHINDUCTOR_MAX_AUTOTUNE=1 2025-09-07T10:37:04.9673746Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --training --amp --backend inductor --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_max_autotune_torchbench_amp_training_cuda_accuracy.csv 2025-09-07T10:37:05.6750658Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:37:05.6753338Z import pynvml # type: ignore[import] 2025-09-07T10:37:09.4782881Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:37:09.4789834Z import pynvml # type: ignore[import] 2025-09-07T10:37:12.3691554Z 2025-09-07T10:37:13.3979539Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:37:13.3979921Z loading model: 0it [00:01, ?it/s] 2025-09-07T10:37:13.3980292Z cuda train soft_actor_critic 2025-09-07T10:37:21.3787762Z Autotune Choices Stats: 2025-09-07T10:37:21.3790381Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_1", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T10:37:21.3797543Z AUTOTUNE mm(256x3, 3x1024) 2025-09-07T10:37:21.3798558Z strides: [3, 1], [1, 3] 2025-09-07T10:37:21.3799112Z dtypes: torch.float16, torch.float16 2025-09-07T10:37:21.3800440Z triton_mm_1 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:37:21.3802523Z triton_mm_2 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:21.3805081Z triton_mm_3 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:21.3807148Z triton_mm_4 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:21.3809159Z triton_mm_5 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:37:21.3811274Z triton_mm_6 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:37:21.3814010Z triton_mm_7 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:21.3816253Z triton_mm_10 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:37:21.3818438Z triton_mm_0 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:37:21.3820475Z triton_mm_8 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:21.3822332Z SingleProcess AUTOTUNE benchmarking takes 0.2135 seconds and 0.0168 seconds precompiling for 17 choices 2025-09-07T10:37:22.0994792Z Autotune Choices Stats: 2025-09-07T10:37:22.0996358Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.014336000196635723, "best_triton_pos": 1, "best_triton_time": 0.014336000196635723, "best_triton_kernel": "triton_mm_24", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:37:22.1003398Z AUTOTUNE mm(256x1024, 1024x1024) 2025-09-07T10:37:22.1003707Z strides: [1024, 1], [1, 1024] 2025-09-07T10:37:22.1004014Z dtypes: torch.float16, torch.float16 2025-09-07T10:37:22.1004342Z mm 0.0143 ms 100.0% 2025-09-07T10:37:22.1005040Z triton_mm_24 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 
2025-09-07T10:37:22.1006568Z triton_mm_23 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:37:22.1007764Z triton_mm_20 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:22.1008949Z triton_mm_17 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:37:22.1010136Z triton_mm_27 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:22.1011317Z triton_mm_18 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:22.1012474Z triton_mm_19 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:22.1013655Z triton_mm_26 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:22.1014833Z triton_mm_29 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:22.1015853Z SingleProcess AUTOTUNE benchmarking takes 0.2834 seconds and 0.0799 seconds precompiling for 19 choices 2025-09-07T10:37:23.0090877Z Autotune Choices Stats: 2025-09-07T10:37:23.0092127Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_44", "best_kernel_desc": "ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T10:37:23.0100728Z AUTOTUNE addmm(256x2, 256x1024, 1024x2) 2025-09-07T10:37:23.0101086Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T10:37:23.0101792Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:37:23.0102618Z triton_mm_44 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:37:23.0103823Z triton_mm_38 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:23.0105014Z triton_mm_36 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:37:23.0105773Z bias_addmm 0.0133 ms 76.9% 2025-09-07T10:37:23.0106494Z triton_mm_37 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:23.0107672Z triton_mm_41 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:23.0108829Z triton_mm_49 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:23.0110010Z triton_mm_35 0.0143 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:37:23.0111352Z triton_mm_43 0.0154 ms 66.7% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:37:23.0112543Z triton_mm_48 0.0154 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:23.0113575Z SingleProcess AUTOTUNE benchmarking takes 0.2606 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:37:26.9378112Z Autotune Choices Stats: 2025-09-07T10:37:26.9379634Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.013311999849975109, "best_triton_pos": 1, "best_triton_time": 0.013311999849975109, "best_triton_kernel": "triton_mm_110", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:37:26.9387860Z AUTOTUNE mm(1024x256, 256x1024) 2025-09-07T10:37:26.9388171Z strides: [1, 1024], [1024, 1] 2025-09-07T10:37:26.9388488Z dtypes: torch.float16, torch.float16 2025-09-07T10:37:26.9388799Z mm 0.0133 ms 100.0% 2025-09-07T10:37:26.9389523Z triton_mm_110 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:26.9390734Z triton_mm_112 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:26.9391928Z triton_mm_113 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:26.9393130Z triton_mm_117 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, 
num_warps=4 2025-09-07T10:37:26.9394326Z triton_mm_108 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:37:26.9395824Z triton_mm_111 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:26.9397026Z triton_mm_114 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:26.9398217Z triton_mm_116 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:26.9399400Z triton_mm_118 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:26.9400445Z SingleProcess AUTOTUNE benchmarking takes 0.2681 seconds and 1.1994 seconds precompiling for 19 choices 2025-09-07T10:37:27.3614099Z Autotune Choices Stats: 2025-09-07T10:37:27.3615358Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_50", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T10:37:27.3623715Z AUTOTUNE mm(256x2, 2x1024) 2025-09-07T10:37:27.3624009Z strides: [2, 1], [1024, 1] 2025-09-07T10:37:27.3624309Z dtypes: torch.float16, torch.float16 2025-09-07T10:37:27.3625080Z triton_mm_50 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:37:27.3626619Z 
triton_mm_51 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:37:27.3627817Z triton_mm_52 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:27.3629002Z triton_mm_53 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:27.3630188Z triton_mm_54 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:27.3631375Z triton_mm_55 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:37:27.3632575Z triton_mm_56 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:37:27.3633761Z triton_mm_57 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:27.3634962Z triton_mm_58 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:27.3636158Z triton_mm_59 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:27.3637183Z SingleProcess AUTOTUNE benchmarking takes 0.2023 seconds and 0.0002 seconds precompiling for 17 choices 
2025-09-07T10:37:27.8567047Z Autotune Choices Stats: 2025-09-07T10:37:27.8568561Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.013311999849975109, "best_triton_pos": 1, "best_triton_time": 0.014336000196635723, "best_triton_kernel": "triton_mm_91", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:37:27.8578229Z AUTOTUNE mm(256x1024, 1024x1024) 2025-09-07T10:37:27.8578540Z strides: [1024, 1], [1024, 1] 2025-09-07T10:37:27.8578841Z dtypes: torch.float16, torch.float16 2025-09-07T10:37:27.8579136Z mm 0.0133 ms 100.0% 2025-09-07T10:37:27.8579834Z triton_mm_91 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:27.8581019Z triton_mm_90 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:37:27.8582214Z triton_mm_87 0.0174 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:27.8583404Z triton_mm_94 0.0174 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:27.8584590Z triton_mm_84 0.0184 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:37:27.8585751Z triton_mm_96 0.0184 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:27.8586927Z triton_mm_85 0.0195 
ms 68.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:27.8588261Z triton_mm_93 0.0195 ms 68.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:27.8589441Z triton_mm_86 0.0205 ms 65.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:27.8590465Z SingleProcess AUTOTUNE benchmarking takes 0.2583 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:37:28.0880540Z Autotune Choices Stats: 2025-09-07T10:37:28.0881773Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_120", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:37:28.0891885Z AUTOTUNE mm(1024x256, 256x3) 2025-09-07T10:37:28.0892240Z strides: [1, 1024], [3, 1] 2025-09-07T10:37:28.0892539Z dtypes: torch.float16, torch.float16 2025-09-07T10:37:28.0893328Z triton_mm_120 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:37:28.0894552Z triton_mm_123 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:28.0895750Z triton_mm_129 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:37:28.0896934Z triton_mm_126 0.0102 ms 90.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:28.0898178Z triton_mm_134 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:28.0899367Z triton_mm_122 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:28.0900855Z triton_mm_133 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:28.0901605Z mm 0.0123 ms 75.0% 2025-09-07T10:37:28.0902297Z triton_mm_121 0.0123 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:37:28.0903472Z triton_mm_127 0.0123 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:28.0904507Z SingleProcess AUTOTUNE benchmarking takes 0.2105 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:37:28.5198619Z Autotune Choices Stats: 2025-09-07T10:37:28.5199882Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_67", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:37:28.5209898Z AUTOTUNE mm(2x256, 256x1024) 2025-09-07T10:37:28.5210210Z strides: [1, 2], [1024, 1] 2025-09-07T10:37:28.5210504Z dtypes: torch.float16, torch.float16 2025-09-07T10:37:28.5211257Z triton_mm_67 0.0092 ms 100.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:37:28.5212813Z triton_mm_68 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:28.5214002Z triton_mm_69 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:37:28.5215205Z triton_mm_70 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:37:28.5216388Z triton_mm_73 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:28.5217636Z triton_mm_74 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:28.5218843Z triton_mm_77 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:28.5220053Z triton_mm_78 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:37:28.5221248Z triton_mm_80 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:37:28.5222431Z triton_mm_82 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=8 2025-09-07T10:37:28.5223455Z SingleProcess AUTOTUNE benchmarking takes 0.2203 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T10:37:29.9198680Z W0907 10:37:29.918000 46747 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:37:31.3158477Z pass 2025-09-07T10:37:34.5041601Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:37:34.5043528Z import pynvml # type: ignore[import] 2025-09-07T10:37:37.3580285Z 2025-09-07T10:37:39.3769497Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:37:39.3769895Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:37:39.3770236Z cuda train speech_transformer 2025-09-07T10:37:53.0440735Z W0907 10:37:53.043000 48837 site-packages/torch/_inductor/utils.py:2298] [9/0_1] DeviceCopy in input program 2025-09-07T10:37:57.7874001Z Autotune Choices Stats: 2025-09-07T10:37:57.7875528Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.02969600073993206, "best_triton_pos": 1, "best_triton_time": 0.03174399957060814, "best_triton_kernel": "triton_mm_138", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:37:57.7886149Z AUTOTUNE mm(2040x512, 512x2048) 2025-09-07T10:37:57.7886470Z strides: [512, 1], [1, 512] 2025-09-07T10:37:57.7886771Z dtypes: torch.float16, torch.float16 2025-09-07T10:37:57.7887084Z mm 0.0297 ms 100.0% 2025-09-07T10:37:57.7887792Z triton_mm_138 0.0317 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:57.7888989Z 
triton_mm_139 0.0317 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:57.7890498Z triton_mm_136 0.0328 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:57.7891691Z triton_mm_143 0.0328 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:57.7892892Z triton_mm_142 0.0338 ms 87.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:57.7894090Z triton_mm_137 0.0358 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:57.7895274Z triton_mm_140 0.0358 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:57.7896468Z triton_mm_144 0.0369 ms 80.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:57.7897737Z triton_mm_133 0.0410 ms 72.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:37:57.7898746Z SingleProcess AUTOTUNE benchmarking takes 0.3263 seconds and 0.0004 seconds precompiling for 19 choices 2025-09-07T10:37:59.4408932Z Autotune Choices Stats: 2025-09-07T10:37:59.4410471Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "bias_addmm", "best_time": 0.014336000196635723, "best_triton_pos": 1, 
"best_triton_time": 0.015359999611973763, "best_triton_kernel": "triton_mm_9", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:37:59.4422305Z AUTOTUNE addmm(2040x512, 2040x320, 320x512) 2025-09-07T10:37:59.4422695Z strides: [0, 1], [320, 1], [1, 320] 2025-09-07T10:37:59.4423051Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:37:59.4423417Z bias_addmm 0.0143 ms 100.0% 2025-09-07T10:37:59.4424566Z triton_mm_9 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.4425772Z triton_mm_11 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.4426962Z triton_mm_12 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.4428152Z triton_mm_16 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.4429348Z triton_mm_17 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:59.4430520Z triton_mm_7 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:37:59.4431686Z triton_mm_10 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 
2025-09-07T10:37:59.4432864Z triton_mm_13 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:59.4434211Z triton_mm_15 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.4435244Z SingleProcess AUTOTUNE benchmarking takes 0.2910 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:37:59.7029874Z Autotune Choices Stats: 2025-09-07T10:37:59.7031346Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.016383999958634377, "best_triton_pos": 1, "best_triton_time": 0.016383999958634377, "best_triton_kernel": "triton_mm_34", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:37:59.7042621Z AUTOTUNE mm(2040x512, 512x512) 2025-09-07T10:37:59.7043063Z strides: [512, 1], [1, 512] 2025-09-07T10:37:59.7043362Z dtypes: torch.float16, torch.float16 2025-09-07T10:37:59.7043657Z mm 0.0164 ms 100.0% 2025-09-07T10:37:59.7044379Z triton_mm_34 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.7045590Z triton_mm_35 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:37:59.7046784Z triton_mm_29 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.7047966Z triton_mm_25 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, 
BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:37:59.7049143Z triton_mm_27 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.7050316Z triton_mm_28 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:59.7051493Z triton_mm_30 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.7052931Z triton_mm_31 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:59.7054123Z triton_mm_33 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.7055154Z SingleProcess AUTOTUNE benchmarking takes 0.2611 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:37:59.9752985Z Autotune Choices Stats: 2025-09-07T10:37:59.9754198Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_bmm_80", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.01740800030529499, "best_triton_pos": 0} 2025-09-07T10:37:59.9767475Z AUTOTUNE bmm(80x204x64, 80x64x204) 2025-09-07T10:37:59.9767826Z strides: [13056, 64, 1], [13056, 1, 64] 2025-09-07T10:37:59.9768155Z dtypes: torch.float16, torch.float16 2025-09-07T10:37:59.9768906Z triton_bmm_80 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:59.9770098Z triton_bmm_79 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:37:59.9771519Z triton_bmm_81 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.9772709Z triton_bmm_83 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.9773906Z triton_bmm_84 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:37:59.9775097Z triton_bmm_88 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.9776280Z triton_bmm_76 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:37:59.9777499Z triton_bmm_78 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:37:59.9778687Z triton_bmm_82 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:37:59.9779880Z triton_bmm_85 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:37:59.9780912Z SingleProcess AUTOTUNE 
benchmarking takes 0.2699 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:38:00.2388568Z Autotune Choices Stats: 2025-09-07T10:38:00.2389724Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_bmm_97", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.020479999482631683, "best_triton_pos": 0} 2025-09-07T10:38:00.2403500Z AUTOTUNE bmm(80x204x204, 80x204x64) 2025-09-07T10:38:00.2403843Z strides: [41664, 204, 1], [13056, 64, 1] 2025-09-07T10:38:00.2404179Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:00.2405188Z triton_bmm_97 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:00.2406421Z triton_bmm_100 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:00.2407644Z triton_bmm_104 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:00.2408837Z triton_bmm_98 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:00.2410037Z triton_bmm_101 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:00.2411243Z triton_bmm_102 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:00.2412458Z triton_bmm_105 0.0215 ms 95.2% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:00.2413676Z triton_bmm_107 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:00.2415020Z triton_bmm_93 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:00.2416214Z triton_bmm_96 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:00.2417287Z SingleProcess AUTOTUNE benchmarking takes 0.2630 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:38:00.5689644Z Autotune Choices Stats: 2025-09-07T10:38:00.5691112Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.035840000957250595, "best_triton_pos": 1, "best_triton_time": 0.04095999896526337, "best_triton_kernel": "triton_mm_156", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:38:00.5705169Z AUTOTUNE mm(2040x2048, 2048x512) 2025-09-07T10:38:00.5705486Z strides: [2048, 1], [1, 2048] 2025-09-07T10:38:00.5705789Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:00.5706107Z mm 0.0358 ms 100.0% 2025-09-07T10:38:00.5706807Z triton_mm_156 0.0410 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:00.5708025Z triton_mm_161 0.0430 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:00.5709212Z triton_mm_152 0.0440 ms 81.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:00.5710392Z triton_mm_153 0.0451 ms 79.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:00.5711585Z triton_mm_157 0.0461 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:00.5712781Z triton_mm_162 0.0461 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:00.5714221Z triton_mm_154 0.0481 ms 74.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:00.5715409Z triton_mm_155 0.0512 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:00.5716612Z triton_mm_160 0.0512 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:00.5717665Z SingleProcess AUTOTUNE benchmarking takes 0.3284 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:38:00.6241962Z cudagraph partition due to non gpu ops 2025-09-07T10:38:00.6242338Z cudagraph partition due to non gpu ops 2025-09-07T10:38:00.6242672Z cudagraph partition due to non gpu ops 2025-09-07T10:38:00.6243222Z cudagraph partition due to non gpu ops 2025-09-07T10:38:00.6243581Z cudagraph partition due to non gpu ops 
2025-09-07T10:38:00.6243917Z cudagraph partition due to non gpu ops 2025-09-07T10:38:00.6244239Z cudagraph partition due to non gpu ops 2025-09-07T10:38:00.6244590Z cudagraph partition due to DeviceCopy ops 2025-09-07T10:38:00.6710881Z cudagraph partition into 2 partitions 2025-09-07T10:38:23.9581146Z W0907 10:38:23.957000 48837 site-packages/torch/_inductor/utils.py:2298] [15/0_1] DeviceCopy in input program 2025-09-07T10:38:30.1848228Z Autotune Choices Stats: 2025-09-07T10:38:30.1851395Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.013311999849975109, "best_triton_pos": 1, "best_triton_time": 0.013311999849975109, "best_triton_kernel": "triton_mm_1097", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:38:30.1862488Z AUTOTUNE mm(220x512, 512x2048) 2025-09-07T10:38:30.1863063Z strides: [512, 1], [1, 512] 2025-09-07T10:38:30.1863559Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:30.1864101Z mm 0.0133 ms 100.0% 2025-09-07T10:38:30.1865325Z triton_mm_1097 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:30.1867513Z triton_mm_1093 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:30.1869652Z triton_mm_1096 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:30.1871740Z triton_mm_1099 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:30.1873873Z triton_mm_1087 
0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:30.1875967Z triton_mm_1094 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:30.1878012Z triton_mm_1095 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:30.1880150Z triton_mm_1098 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:30.1882715Z triton_mm_1102 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:30.1884943Z SingleProcess AUTOTUNE benchmarking takes 0.2550 seconds and 0.0006 seconds precompiling for 19 choices 2025-09-07T10:38:31.5244833Z Autotune Choices Stats: 2025-09-07T10:38:31.5247031Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_892", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T10:38:31.5260761Z AUTOTUNE mm(220x512, 512x512) 2025-09-07T10:38:31.5261307Z strides: [512, 1], [1, 512] 2025-09-07T10:38:31.5261792Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:31.5263119Z triton_mm_892 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:31.5264495Z mm 0.0113 ms 90.9% 2025-09-07T10:38:31.5265711Z 
triton_mm_890 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:31.5267791Z triton_mm_891 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:31.5269909Z triton_mm_896 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:31.5272420Z triton_mm_889 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:31.5274435Z triton_mm_895 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:31.5276556Z triton_mm_898 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:31.5278672Z triton_mm_899 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:31.5280741Z triton_mm_901 0.0143 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:31.5282536Z SingleProcess AUTOTUNE benchmarking takes 0.2502 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:38:31.6685182Z Autotune Choices Stats: 2025-09-07T10:38:31.6687191Z {"num_choices": 11, "num_triton_choices": 10, "best_kernel": "triton_bmm_943", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T10:38:31.6701164Z AUTOTUNE bmm(80x22x64, 80x64x22) 2025-09-07T10:38:31.6701517Z strides: [1408, 64, 1], [1408, 1, 64] 2025-09-07T10:38:31.6701829Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:31.6702612Z triton_bmm_943 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:31.6703836Z triton_bmm_944 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:31.6705048Z triton_bmm_945 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:31.6706604Z triton_bmm_946 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:31.6707824Z triton_bmm_947 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:31.6709017Z triton_bmm_948 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:31.6710205Z triton_bmm_949 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:31.6711403Z triton_bmm_950 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:38:31.6712601Z triton_bmm_951 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:38:31.6713353Z bmm 0.0092 ms 88.9% 2025-09-07T10:38:31.6713891Z SingleProcess AUTOTUNE benchmarking takes 0.1424 seconds and 0.0002 seconds precompiling for 11 choices 2025-09-07T10:38:31.8355824Z Autotune Choices Stats: 2025-09-07T10:38:31.8357861Z {"num_choices": 13, "num_triton_choices": 12, "best_kernel": "triton_bmm_963", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T10:38:31.8372037Z AUTOTUNE bmm(80x22x22, 80x22x64) 2025-09-07T10:38:31.8372552Z strides: [484, 22, 1], [1408, 64, 1] 2025-09-07T10:38:31.8373101Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:31.8374477Z triton_bmm_963 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T10:38:31.8376590Z triton_bmm_952 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:38:31.8378810Z triton_bmm_953 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:31.8380886Z triton_bmm_954 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:31.8382979Z triton_bmm_955 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, 
BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:31.8385046Z triton_bmm_956 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:31.8387183Z triton_bmm_957 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:31.8389349Z triton_bmm_958 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:31.8391473Z triton_bmm_959 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:31.8393585Z triton_bmm_960 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:31.8395726Z SingleProcess AUTOTUNE benchmarking takes 0.1666 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T10:38:32.0663745Z Autotune Choices Stats: 2025-09-07T10:38:32.0665729Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_bmm_1036", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T10:38:32.0680233Z AUTOTUNE bmm(80x22x64, 80x64x204) 2025-09-07T10:38:32.0680768Z strides: [1408, 64, 1], [13056, 1, 64] 2025-09-07T10:38:32.0681268Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:32.0682642Z triton_bmm_1036 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:38:32.0685022Z triton_bmm_1037 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:32.0687150Z triton_bmm_1038 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:32.0689233Z triton_bmm_1039 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:32.0691705Z triton_bmm_1040 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:32.0693835Z triton_bmm_1041 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:32.0695982Z triton_bmm_1042 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:32.0698203Z triton_bmm_1043 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:32.0700341Z triton_bmm_1044 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:32.0702477Z triton_bmm_1045 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T10:38:32.0704205Z SingleProcess AUTOTUNE benchmarking takes 0.2270 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:38:32.2723592Z Autotune Choices Stats: 2025-09-07T10:38:32.2725518Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_bmm_1055", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T10:38:32.2740754Z AUTOTUNE bmm(80x22x204, 80x204x64) 2025-09-07T10:38:32.2741271Z strides: [4544, 204, 1], [13056, 64, 1] 2025-09-07T10:38:32.2741818Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:32.2743251Z triton_bmm_1055 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:32.2745360Z triton_bmm_1060 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:32.2747878Z triton_bmm_1061 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:32.2749985Z triton_bmm_1062 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:32.2752098Z triton_bmm_1063 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:32.2754328Z triton_bmm_1064 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T10:38:32.2756508Z triton_bmm_1065 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:38:32.2758637Z triton_bmm_1067 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:32.2760783Z triton_bmm_1054 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:32.2763126Z triton_bmm_1056 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:32.2765283Z SingleProcess AUTOTUNE benchmarking takes 0.2055 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T10:38:32.5691520Z Autotune Choices Stats: 2025-09-07T10:38:32.5693491Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_1108", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T10:38:32.5708750Z AUTOTUNE mm(220x2048, 2048x512) 2025-09-07T10:38:32.5709225Z strides: [2048, 1], [1, 2048] 2025-09-07T10:38:32.5709692Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:32.5711020Z triton_mm_1108 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:32.5712350Z mm 0.0174 ms 88.2% 2025-09-07T10:38:32.5713599Z triton_mm_1112 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:32.5715716Z triton_mm_1105 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:32.5717828Z triton_mm_1106 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:32.5719936Z triton_mm_1107 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:32.5722031Z triton_mm_1111 0.0246 ms 62.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:32.5724463Z triton_mm_1115 0.0287 ms 53.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:32.5726591Z triton_mm_1114 0.0317 ms 48.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:32.5729056Z triton_mm_1117 0.0328 ms 46.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:32.5730898Z SingleProcess AUTOTUNE benchmarking takes 0.2957 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:38:32.8665931Z Autotune Choices Stats: 2025-09-07T10:38:32.8667950Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_2300", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 
0.011264000087976456, "best_triton_pos": 0} 2025-09-07T10:38:32.8684322Z AUTOTUNE mm(220x512, 512x1014) 2025-09-07T10:38:32.8684855Z strides: [512, 1], [1, 512] 2025-09-07T10:38:32.8685345Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:32.8686667Z triton_mm_2300 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:32.8688769Z triton_mm_2299 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:32.8690894Z triton_mm_2293 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:32.8693380Z triton_mm_2303 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:32.8695501Z triton_mm_2294 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:32.8697697Z triton_mm_2295 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:32.8699791Z triton_mm_2296 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:32.8701804Z triton_mm_2302 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:32.8703948Z triton_mm_2305 0.0143 ms 78.6% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:32.8705280Z mm 0.0164 ms 68.8% 2025-09-07T10:38:32.8706207Z SingleProcess AUTOTUNE benchmarking takes 0.2511 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:38:32.9070533Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9070927Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9071261Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9071607Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9071948Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9072286Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9072618Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9072952Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9073289Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9073630Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9073949Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9074284Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9074617Z cudagraph partition due to non gpu ops 2025-09-07T10:38:32.9074964Z cudagraph partition due to DeviceCopy ops 2025-09-07T10:38:32.9744027Z cudagraph partition into 2 partitions 2025-09-07T10:38:37.3819385Z W0907 10:38:37.381000 48837 site-packages/torch/_inductor/utils.py:2298] [15/0_1] DeviceCopy in input program 2025-09-07T10:38:48.3734395Z Autotune Choices Stats: 2025-09-07T10:38:48.3735632Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_2373", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.01228800043463707, "best_triton_pos": 0} 2025-09-07T10:38:48.3753351Z AUTOTUNE mm(512x220, 220x2048) 2025-09-07T10:38:48.3753668Z strides: [1, 512], [2048, 1] 
2025-09-07T10:38:48.3753997Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:48.3754779Z triton_mm_2373 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:48.3755998Z triton_mm_2374 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:48.3757209Z triton_mm_2375 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:48.3758412Z triton_mm_2376 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:48.3759621Z triton_mm_2379 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:48.3761182Z triton_mm_2380 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:48.3762415Z triton_mm_2381 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:48.3763455Z mm 0.0143 ms 85.7% 2025-09-07T10:38:48.3764143Z triton_mm_2370 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:48.3765345Z triton_mm_2371 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=8 2025-09-07T10:38:48.3766388Z SingleProcess AUTOTUNE benchmarking takes 0.2452 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:38:48.6822524Z Autotune Choices Stats: 2025-09-07T10:38:48.6823797Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_2412", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.01228800043463707, "best_triton_pos": 0} 2025-09-07T10:38:48.6842742Z AUTOTUNE mm(2048x220, 220x512) 2025-09-07T10:38:48.6843147Z strides: [1, 2048], [512, 1] 2025-09-07T10:38:48.6843454Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:48.6844254Z triton_mm_2412 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:48.6845024Z mm 0.0133 ms 92.3% 2025-09-07T10:38:48.6845732Z triton_mm_2409 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:48.6846956Z triton_mm_2413 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:48.6848544Z triton_mm_2415 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:48.6849776Z triton_mm_2416 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:48.6850995Z triton_mm_2417 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:48.6852212Z triton_mm_2406 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:48.6853410Z triton_mm_2407 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:48.6854619Z triton_mm_2410 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:48.6855653Z SingleProcess AUTOTUNE benchmarking takes 0.2481 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:38:49.1066362Z Autotune Choices Stats: 2025-09-07T10:38:49.1067851Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.016383999958634377, "best_triton_pos": 1, "best_triton_time": 0.016383999958634377, "best_triton_kernel": "triton_mm_4409", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:38:49.1087971Z AUTOTUNE mm(2040x512, 512x512) 2025-09-07T10:38:49.1088290Z strides: [512, 1], [512, 1] 2025-09-07T10:38:49.1088584Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:49.1088882Z mm 0.0164 ms 100.0% 2025-09-07T10:38:49.1089612Z triton_mm_4409 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:49.1090856Z triton_mm_4414 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:49.1092067Z triton_mm_4415 0.0164 ms 
100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:49.1093281Z triton_mm_4410 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:49.1094483Z triton_mm_4405 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:49.1095680Z triton_mm_4407 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:49.1096877Z triton_mm_4413 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:49.1098146Z triton_mm_4408 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:49.1099352Z triton_mm_4411 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:49.1100387Z SingleProcess AUTOTUNE benchmarking takes 0.2564 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:38:57.9042100Z Autotune Choices Stats: 2025-09-07T10:38:57.9043568Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_2320", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T10:38:57.9061307Z AUTOTUNE mm(1014x220, 
220x512) 2025-09-07T10:38:57.9061624Z strides: [1, 1014], [512, 1] 2025-09-07T10:38:57.9061927Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:57.9062730Z triton_mm_2320 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:57.9063937Z triton_mm_2321 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:57.9065163Z triton_mm_2319 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:57.9066377Z triton_mm_2322 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:57.9067585Z triton_mm_2323 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:57.9069003Z triton_mm_2312 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:57.9070201Z triton_mm_2317 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:57.9071420Z triton_mm_2325 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:57.9072169Z mm 0.0143 ms 78.6% 2025-09-07T10:38:57.9072870Z triton_mm_2316 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:57.9073906Z SingleProcess AUTOTUNE benchmarking takes 0.2454 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:38:58.3833873Z Autotune Choices Stats: 2025-09-07T10:38:58.3835424Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.013311999849975109, "best_triton_pos": 1, "best_triton_time": 0.013311999849975109, "best_triton_kernel": "triton_mm_2357", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:38:58.3856467Z AUTOTUNE mm(220x512, 512x2048) 2025-09-07T10:38:58.3856768Z strides: [512, 1], [2048, 1] 2025-09-07T10:38:58.3857083Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:58.3857446Z mm 0.0133 ms 100.0% 2025-09-07T10:38:58.3858160Z triton_mm_2357 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:58.3859371Z triton_mm_2353 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:58.3860573Z triton_mm_2356 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:58.3862124Z triton_mm_2359 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:58.3863325Z triton_mm_2347 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:58.3864524Z 
triton_mm_2354 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:58.3865727Z triton_mm_2355 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:58.3866923Z triton_mm_2358 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:58.3868123Z triton_mm_2362 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:58.3869152Z SingleProcess AUTOTUNE benchmarking takes 0.2499 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:38:58.9321050Z Autotune Choices Stats: 2025-09-07T10:38:58.9322293Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_2444", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:38:58.9344134Z AUTOTUNE mm(512x220, 220x512) 2025-09-07T10:38:58.9344486Z strides: [1, 512], [512, 1] 2025-09-07T10:38:58.9344772Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:58.9345592Z triton_mm_2444 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:58.9346356Z mm 0.0102 ms 90.0% 2025-09-07T10:38:58.9347059Z triton_mm_2443 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 
2025-09-07T10:38:58.9348271Z triton_mm_2446 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:58.9349492Z triton_mm_2447 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:58.9350689Z triton_mm_2449 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:58.9351893Z triton_mm_2438 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:58.9353093Z triton_mm_2439 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:58.9354304Z triton_mm_2445 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:58.9355511Z triton_mm_2448 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:58.9356547Z SingleProcess AUTOTUNE benchmarking takes 0.2403 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:38:59.4556319Z Autotune Choices Stats: 2025-09-07T10:38:59.4558246Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.018432000651955605, "best_triton_pos": 1, "best_triton_time": 0.020479999482631683, "best_triton_kernel": "triton_mm_2544", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, 
BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:38:59.4579814Z AUTOTUNE mm(512x2040, 2040x512) 2025-09-07T10:38:59.4580120Z strides: [1, 512], [512, 1] 2025-09-07T10:38:59.4580416Z dtypes: torch.float16, torch.float16 2025-09-07T10:38:59.4580730Z mm 0.0184 ms 100.0% 2025-09-07T10:38:59.4581447Z triton_mm_2544 0.0205 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:38:59.4582667Z triton_mm_2543 0.0256 ms 72.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:38:59.4583893Z triton_mm_2537 0.0276 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:38:59.4585104Z triton_mm_2547 0.0297 ms 62.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:38:59.4586305Z triton_mm_2539 0.0317 ms 58.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:59.4587716Z triton_mm_2538 0.0328 ms 56.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:38:59.4588928Z triton_mm_2546 0.0338 ms 54.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:59.4590128Z triton_mm_2540 0.0348 ms 52.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=4 2025-09-07T10:38:59.4591335Z triton_mm_2549 0.0348 ms 52.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:38:59.4592379Z SingleProcess AUTOTUNE benchmarking takes 0.3083 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:39:00.2341987Z Autotune Choices Stats: 2025-09-07T10:39:00.2350946Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.011264000087976456, "best_triton_pos": 1, "best_triton_time": 0.011264000087976456, "best_triton_kernel": "triton_mm_5132", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T10:39:00.2365975Z AUTOTUNE mm(220x512, 512x512) 2025-09-07T10:39:00.2366301Z strides: [512, 1], [512, 1] 2025-09-07T10:39:00.2366623Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:00.2366940Z mm 0.0113 ms 100.0% 2025-09-07T10:39:00.2367639Z triton_mm_5132 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:00.2368851Z triton_mm_5134 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:00.2370080Z triton_mm_5138 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:00.2371610Z triton_mm_5131 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:00.2372811Z triton_mm_5133 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:00.2374005Z triton_mm_5137 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:00.2375201Z triton_mm_5141 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:00.2376395Z triton_mm_5140 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:00.2377689Z triton_mm_5143 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:00.2378733Z SingleProcess AUTOTUNE benchmarking takes 0.2465 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:39:01.3012239Z Autotune Choices Stats: 2025-09-07T10:39:01.3013486Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_2330", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T10:39:01.3036101Z AUTOTUNE mm(220x1014, 1014x512) 2025-09-07T10:39:01.3036424Z strides: [1014, 1], [512, 1] 2025-09-07T10:39:01.3036712Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:01.3037503Z triton_mm_2330 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:01.3038266Z mm 0.0174 ms 88.2% 2025-09-07T10:39:01.3038975Z triton_mm_2329 0.0174 ms 88.2% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:01.3040188Z triton_mm_2332 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:01.3041388Z triton_mm_2336 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:01.3042600Z triton_mm_2335 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:01.3044011Z triton_mm_2339 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:01.3045215Z triton_mm_2331 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:01.3046421Z triton_mm_2338 0.0236 ms 65.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:01.3047626Z triton_mm_2337 0.0246 ms 62.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:01.3048662Z SingleProcess AUTOTUNE benchmarking takes 0.2707 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:39:01.5923878Z Autotune Choices Stats: 2025-09-07T10:39:01.5925307Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_2386", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T10:39:01.5947597Z AUTOTUNE mm(220x2048, 2048x512) 2025-09-07T10:39:01.5948009Z strides: [2048, 1], [512, 1] 2025-09-07T10:39:01.5948314Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:01.5949089Z triton_mm_2386 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:01.5949876Z mm 0.0164 ms 93.7% 2025-09-07T10:39:01.5950573Z triton_mm_2390 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:01.5951771Z triton_mm_2384 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:01.5952964Z triton_mm_2385 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:01.5954139Z triton_mm_2389 0.0246 ms 62.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:01.5955627Z triton_mm_2383 0.0256 ms 60.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:01.5956826Z triton_mm_2393 0.0266 ms 57.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:01.5958037Z triton_mm_2392 0.0328 ms 46.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:01.5959233Z triton_mm_2395 0.0348 ms 44.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:01.5960264Z SingleProcess AUTOTUNE benchmarking takes 0.2906 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:39:01.7576891Z Autotune Choices Stats: 2025-09-07T10:39:01.7578132Z {"num_choices": 13, "num_triton_choices": 12, "best_kernel": "triton_bmm_2685", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T10:39:01.7600206Z AUTOTUNE bmm(80x64x22, 80x22x22) 2025-09-07T10:39:01.7600534Z strides: [1408, 1, 64], [484, 22, 1] 2025-09-07T10:39:01.7600861Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:01.7601636Z triton_bmm_2685 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:01.7603050Z triton_bmm_2686 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:01.7604270Z triton_bmm_2687 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:01.7605494Z triton_bmm_2688 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:01.7606931Z triton_bmm_2689 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:01.7608161Z triton_bmm_2690 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:01.7609382Z triton_bmm_2691 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:01.7610603Z triton_bmm_2692 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:01.7611818Z triton_bmm_2693 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:01.7613021Z triton_bmm_2694 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:01.7614057Z SingleProcess AUTOTUNE benchmarking takes 0.1583 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T10:39:01.9197389Z Autotune Choices Stats: 2025-09-07T10:39:01.9198564Z {"num_choices": 13, "num_triton_choices": 12, "best_kernel": "triton_bmm_2663", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T10:39:01.9220928Z AUTOTUNE bmm(80x22x22, 80x22x64) 2025-09-07T10:39:01.9221240Z strides: [484, 1, 22], [1408, 64, 1] 2025-09-07T10:39:01.9221542Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:01.9222321Z triton_bmm_2663 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:01.9223540Z triton_bmm_2664 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:01.9224755Z triton_bmm_2665 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:01.9225970Z triton_bmm_2666 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:01.9227192Z triton_bmm_2667 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:01.9228391Z triton_bmm_2668 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:01.9229602Z triton_bmm_2669 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:01.9230817Z triton_bmm_2670 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:01.9232032Z triton_bmm_2671 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:01.9233256Z triton_bmm_2672 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:01.9234474Z SingleProcess 
AUTOTUNE benchmarking takes 0.1581 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T10:39:02.1973205Z Autotune Choices Stats: 2025-09-07T10:39:02.1974394Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_bmm_2455", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:39:02.1999966Z AUTOTUNE bmm(80x204x22, 80x22x64) 2025-09-07T10:39:02.2000333Z strides: [4544, 1, 204], [1408, 64, 1] 2025-09-07T10:39:02.2000677Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:02.2001450Z triton_bmm_2455 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:02.2002688Z triton_bmm_2456 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:02.2004108Z triton_bmm_2457 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:02.2005327Z triton_bmm_2460 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:02.2006543Z triton_bmm_2461 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:02.2008022Z triton_bmm_2462 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:02.2009229Z triton_bmm_2463 
0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:02.2010506Z triton_bmm_2464 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:02.2011732Z triton_bmm_2465 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:02.2012962Z triton_bmm_2466 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:02.2014013Z SingleProcess AUTOTUNE benchmarking takes 0.2069 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:39:02.4045450Z Autotune Choices Stats: 2025-09-07T10:39:02.4046649Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_bmm_2489", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:39:02.4069053Z AUTOTUNE bmm(80x64x22, 80x22x204) 2025-09-07T10:39:02.4069382Z strides: [1408, 1, 64], [4544, 204, 1] 2025-09-07T10:39:02.4069710Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:02.4070484Z triton_bmm_2489 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:02.4071721Z triton_bmm_2487 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:39:02.4073147Z triton_bmm_2488 
0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:02.4074356Z triton_bmm_2490 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:02.4075570Z triton_bmm_2491 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:02.4076783Z triton_bmm_2492 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:02.4077996Z triton_bmm_2493 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:02.4079210Z triton_bmm_2494 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:02.4080414Z triton_bmm_2495 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:02.4081638Z triton_bmm_2496 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:02.4082681Z SingleProcess AUTOTUNE benchmarking takes 0.2064 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:39:02.5059441Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5059816Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5060164Z cudagraph partition due to non gpu ops 
2025-09-07T10:39:02.5060489Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5060830Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5061188Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5061526Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5061849Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5062189Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5062529Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5062870Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5063194Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5063532Z cudagraph partition due to non gpu ops 2025-09-07T10:39:02.5063877Z cudagraph partition due to DeviceCopy ops 2025-09-07T10:39:02.6194143Z cudagraph partition into 2 partitions 2025-09-07T10:39:07.2292761Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/torchbench/torchbenchmark/models/speech_transformer/speech_transformer/transformer/decoder.py", line 126, in torch_dynamo_resume_in_forward_at_120 2025-09-07T10:39:07.2295418Z self.tgt_word_emb(ys_in_pad) * self.x_logit_scale 2025-09-07T10:39:07.2295933Z 2025-09-07T10:39:07.2295941Z 2025-09-07T10:39:08.1720590Z Run failed with return code: -11 2025-09-07T10:39:08.1720980Z Output: None 2025-09-07T10:39:08.1721209Z Error: None 2025-09-07T10:39:08.7765769Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:39:08.7768413Z import pynvml # type: ignore[import] 2025-09-07T10:39:11.4756784Z 2025-09-07T10:39:12.8995619Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:39:12.8996246Z loading model: 0it [00:01, ?it/s] 2025-09-07T10:39:12.8996851Z cuda train squeezenet1_1 2025-09-07T10:39:25.3608484Z Autotune Choices Stats: 2025-09-07T10:39:25.3610465Z {"num_choices": 17, "num_triton_choices": 15, "best_kernel": "triton_mm_60", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T10:39:25.3611965Z AUTOTUNE addmm(12100x64, 12100x16, 16x64) 2025-09-07T10:39:25.3700018Z strides: [0, 1], [16, 1], [1, 16] 2025-09-07T10:39:25.3700421Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:25.3701233Z triton_mm_60 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:25.3702458Z triton_mm_61 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:25.3703640Z triton_mm_62 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:25.3704834Z triton_mm_63 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:25.3706006Z triton_mm_64 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:25.3707186Z triton_mm_65 0.0082 ms 100.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:25.3708648Z triton_mm_66 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:25.3709824Z triton_mm_67 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:25.3711004Z triton_mm_68 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:25.3712181Z triton_mm_69 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:25.3713207Z SingleProcess AUTOTUNE benchmarking takes 0.3866 seconds and 0.0004 seconds precompiling for 17 choices 2025-09-07T10:39:27.7432886Z Autotune Choices Stats: 2025-09-07T10:39:27.7434470Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "bias_addmm", "best_time": 0.014336000196635723, "best_triton_pos": 1, "best_triton_time": 0.014336000196635723, "best_triton_kernel": "triton_mm_349", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:39:27.7459446Z AUTOTUNE addmm(676x1000, 676x512, 512x1000) 2025-09-07T10:39:27.7459811Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T10:39:27.7460228Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:27.7460601Z bias_addmm 0.0143 ms 100.0% 2025-09-07T10:39:27.7461343Z triton_mm_349 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:27.7462555Z triton_mm_345 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:27.7463743Z triton_mm_348 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:27.7465240Z triton_mm_351 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:27.7466446Z triton_mm_347 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:27.7467640Z triton_mm_350 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:27.7468840Z triton_mm_355 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:27.7470030Z triton_mm_346 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:27.7471216Z triton_mm_354 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:27.7472240Z SingleProcess AUTOTUNE benchmarking takes 0.2914 seconds and 1.7263 seconds precompiling for 20 choices 2025-09-07T10:39:28.2404979Z Autotune Choices Stats: 2025-09-07T10:39:28.2406203Z {"num_choices": 19, "num_triton_choices": 17, 
"best_kernel": "triton_mm_138", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T10:39:28.2432648Z AUTOTUNE addmm(2916x128, 2916x32, 32x128) 2025-09-07T10:39:28.2433055Z strides: [0, 1], [32, 1], [1, 32] 2025-09-07T10:39:28.2433403Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:28.2434229Z triton_mm_138 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:39:28.2435549Z triton_mm_139 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:28.2436744Z triton_mm_140 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:28.2437923Z triton_mm_141 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:28.2439108Z triton_mm_142 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:28.2440288Z triton_mm_143 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:28.2441467Z triton_mm_144 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:28.2442650Z triton_mm_145 0.0082 ms 100.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:28.2444096Z triton_mm_146 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:28.2445279Z triton_mm_147 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:28.2446614Z SingleProcess AUTOTUNE benchmarking takes 0.2621 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:39:28.7382782Z Autotune Choices Stats: 2025-09-07T10:39:28.7383990Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_7", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:39:28.7410026Z AUTOTUNE addmm(12100x16, 12100x64, 64x16) 2025-09-07T10:39:28.7410392Z strides: [0, 1], [64, 1], [1, 64] 2025-09-07T10:39:28.7410743Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:28.7411553Z triton_mm_7 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:39:28.7412742Z triton_mm_9 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:28.7413925Z triton_mm_13 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:28.7415106Z triton_mm_15 0.0092 ms 100.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:28.7416270Z triton_mm_16 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:28.7417850Z triton_mm_20 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:28.7419039Z triton_mm_6 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:39:28.7420206Z triton_mm_8 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:39:28.7421383Z triton_mm_10 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:28.7422547Z triton_mm_11 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:28.7423562Z SingleProcess AUTOTUNE benchmarking takes 0.2495 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:39:29.2690739Z Autotune Choices Stats: 2025-09-07T10:39:29.2691987Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_45", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T10:39:29.2718010Z AUTOTUNE addmm(12100x16, 12100x128, 128x16) 
2025-09-07T10:39:29.2718362Z strides: [0, 1], [128, 1], [1, 128] 2025-09-07T10:39:29.2718722Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:29.2719522Z triton_mm_45 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:39:29.2720727Z triton_mm_46 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:29.2722229Z triton_mm_50 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:29.2723641Z triton_mm_51 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:29.2724818Z triton_mm_52 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:29.2726003Z triton_mm_54 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:29.2727193Z triton_mm_55 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:29.2728372Z triton_mm_58 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:29.2729128Z bias_addmm 0.0123 ms 91.7% 2025-09-07T10:39:29.2729849Z triton_mm_43 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, 
BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:39:29.2730872Z SingleProcess AUTOTUNE benchmarking takes 0.2489 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:39:29.5750322Z Autotune Choices Stats: 2025-09-07T10:39:29.5751946Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_316", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T10:39:29.5777835Z AUTOTUNE addmm(676x256, 676x64, 64x256) 2025-09-07T10:39:29.5778175Z strides: [0, 1], [64, 1], [1, 64] 2025-09-07T10:39:29.5778540Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:29.5779344Z triton_mm_316 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:29.5780104Z bias_addmm 0.0082 ms 87.5% 2025-09-07T10:39:29.5780831Z triton_mm_312 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:39:29.5782018Z triton_mm_313 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:29.5783190Z triton_mm_314 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:29.5784379Z triton_mm_315 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:29.5785536Z triton_mm_318 0.0082 ms 87.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:29.5786709Z triton_mm_319 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:29.5787893Z triton_mm_320 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:29.5789067Z triton_mm_317 0.0092 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:29.5790334Z SingleProcess AUTOTUNE benchmarking takes 0.2845 seconds and 0.0003 seconds precompiling for 21 choices 2025-09-07T10:39:30.1104981Z Autotune Choices Stats: 2025-09-07T10:39:30.1106203Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_228", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T10:39:30.1134059Z AUTOTUNE addmm(676x192, 676x48, 48x192) 2025-09-07T10:39:30.1134460Z strides: [0, 1], [48, 1], [1, 48] 2025-09-07T10:39:30.1134816Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:30.1135638Z triton_mm_228 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:30.1136848Z triton_mm_224 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:39:30.1138090Z triton_mm_225 0.0082 ms 87.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:30.1139289Z triton_mm_226 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:30.1140476Z triton_mm_227 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:30.1141964Z triton_mm_230 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:30.1143151Z triton_mm_231 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:30.1144323Z triton_mm_232 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:30.1145075Z bias_addmm 0.0092 ms 77.8% 2025-09-07T10:39:30.1145798Z triton_mm_229 0.0092 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:30.1146823Z SingleProcess AUTOTUNE benchmarking takes 0.2877 seconds and 0.0003 seconds precompiling for 21 choices 2025-09-07T10:39:30.6360592Z Autotune Choices Stats: 2025-09-07T10:39:30.6361818Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_82", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T10:39:30.6392494Z AUTOTUNE 
addmm(2916x32, 2916x128, 128x32) 2025-09-07T10:39:30.6392838Z strides: [0, 1], [128, 1], [1, 128] 2025-09-07T10:39:30.6393202Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:30.6394003Z triton_mm_82 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:30.6395229Z triton_mm_84 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:30.6396418Z triton_mm_91 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:30.6397925Z triton_mm_81 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:30.6399117Z triton_mm_83 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:30.6400290Z triton_mm_87 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:30.6401467Z triton_mm_88 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:30.6402619Z triton_mm_89 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:30.6403942Z triton_mm_90 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:30.6405115Z triton_mm_92 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:30.6406150Z SingleProcess AUTOTUNE benchmarking takes 0.2799 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:39:31.1893052Z Autotune Choices Stats: 2025-09-07T10:39:31.1894697Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_122", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T10:39:31.1924292Z AUTOTUNE addmm(2916x32, 2916x256, 256x32) 2025-09-07T10:39:31.1924638Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T10:39:31.1925007Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:31.1925817Z triton_mm_122 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:31.1927021Z triton_mm_123 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:31.1928217Z triton_mm_124 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:31.1929421Z triton_mm_125 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:31.1930616Z triton_mm_128 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:31.1931803Z triton_mm_130 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:31.1932977Z triton_mm_131 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:31.1934158Z triton_mm_132 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:31.1935356Z triton_mm_136 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:31.1936113Z bias_addmm 0.0113 ms 90.9% 2025-09-07T10:39:31.1936909Z SingleProcess AUTOTUNE benchmarking takes 0.2653 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:39:31.4920306Z Autotune Choices Stats: 2025-09-07T10:39:31.4921525Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_251", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T10:39:31.4950777Z AUTOTUNE addmm(676x64, 676x384, 384x64) 2025-09-07T10:39:31.4951099Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T10:39:31.4951458Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:31.4952276Z triton_mm_251 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:31.4953504Z triton_mm_254 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:31.4954270Z bias_addmm 0.0113 ms 90.9% 2025-09-07T10:39:31.4954979Z triton_mm_252 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:31.4956154Z triton_mm_253 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:31.4957637Z triton_mm_257 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:31.4958821Z triton_mm_258 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:31.4960013Z triton_mm_261 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:31.4961194Z triton_mm_262 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:31.4962362Z triton_mm_259 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:31.4963530Z SingleProcess AUTOTUNE benchmarking takes 0.2810 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:39:32.0568903Z Autotune Choices Stats: 2025-09-07T10:39:32.0570111Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_298", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, 
BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T10:39:32.0600052Z AUTOTUNE addmm(676x64, 676x512, 512x64) 2025-09-07T10:39:32.0600423Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T10:39:32.0600771Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:32.0601604Z triton_mm_298 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:32.0602400Z bias_addmm 0.0113 ms 90.9% 2025-09-07T10:39:32.0603392Z triton_mm_295 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:32.0609767Z triton_mm_296 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:32.0611048Z triton_mm_302 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:32.0612265Z triton_mm_306 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:32.0613445Z triton_mm_297 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:32.0614637Z triton_mm_301 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:32.0615819Z triton_mm_305 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:32.0616990Z triton_mm_304 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:32.0618135Z SingleProcess AUTOTUNE benchmarking takes 0.2854 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:39:32.3557111Z Autotune Choices Stats: 2025-09-07T10:39:32.3558653Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "bias_addmm", "best_time": 0.009216000325977802, "best_triton_pos": 1, "best_triton_time": 0.009216000325977802, "best_triton_kernel": "triton_mm_163", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4"} 2025-09-07T10:39:32.3589214Z AUTOTUNE addmm(676x48, 676x256, 256x48) 2025-09-07T10:39:32.3589566Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T10:39:32.3589945Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:32.3590315Z bias_addmm 0.0092 ms 100.0% 2025-09-07T10:39:32.3591068Z triton_mm_163 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:32.3592273Z triton_mm_164 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:32.3593453Z triton_mm_165 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:32.3594644Z triton_mm_166 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=4 2025-09-07T10:39:32.3595836Z triton_mm_173 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:32.3597017Z triton_mm_169 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:32.3598198Z triton_mm_170 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:32.3599380Z triton_mm_172 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:32.3600556Z triton_mm_174 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:32.3601833Z SingleProcess AUTOTUNE benchmarking takes 0.2777 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:39:32.9177111Z Autotune Choices Stats: 2025-09-07T10:39:32.9178717Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "bias_addmm", "best_time": 0.010239999741315842, "best_triton_pos": 1, "best_triton_time": 0.010239999741315842, "best_triton_kernel": "triton_mm_207", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4"} 2025-09-07T10:39:32.9209140Z AUTOTUNE addmm(676x48, 676x384, 384x48) 2025-09-07T10:39:32.9209480Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T10:39:32.9209821Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:39:32.9210194Z bias_addmm 0.0102 ms 100.0% 2025-09-07T10:39:32.9210946Z triton_mm_207 0.0102 
ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:39:32.9212145Z triton_mm_208 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:32.9213339Z triton_mm_209 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:39:32.9214519Z triton_mm_210 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:32.9216026Z triton_mm_214 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:39:32.9217322Z triton_mm_217 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:39:32.9218526Z triton_mm_218 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:39:32.9219717Z triton_mm_213 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:39:32.9220896Z triton_mm_216 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:39:32.9221931Z SingleProcess AUTOTUNE benchmarking takes 0.2812 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:39:33.4422055Z 
Autotune Choices Stats: 2025-09-07T10:39:33.4423800Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.01945599913597107, "best_triton_pos": 1, "best_triton_time": 0.025599999353289604, "best_triton_kernel": "triton_convolution2d_0", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:39:33.4456282Z AUTOTUNE convolution(4x3x224x224, 64x3x3x3) 2025-09-07T10:39:33.4456658Z strides: [150528, 1, 672, 3], [27, 1, 9, 3] 2025-09-07T10:39:33.4456997Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:33.4457407Z convolution 0.0195 ms 100.0% 2025-09-07T10:39:33.4458297Z triton_convolution2d_0 0.0256 ms 76.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.4460118Z triton_convolution2d_3 0.0256 ms 76.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.4461612Z triton_convolution2d_4 0.0287 ms 67.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.4463084Z triton_convolution2d_5 0.0287 ms 67.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.4464573Z triton_convolution2d_2 0.0338 ms 57.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 
2025-09-07T10:39:33.4466056Z triton_convolution2d_1 0.0512 ms 38.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.4467220Z SingleProcess AUTOTUNE benchmarking takes 0.1606 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T10:39:33.5458250Z Autotune Choices Stats: 2025-09-07T10:39:33.5459890Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.013311999849975109, "best_triton_pos": 1, "best_triton_time": 0.013311999849975109, "best_triton_kernel": "triton_convolution2d_37", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:39:33.5491916Z AUTOTUNE convolution(4x16x55x55, 64x16x3x3) 2025-09-07T10:39:33.5492287Z strides: [48400, 1, 880, 16], [144, 1, 48, 16] 2025-09-07T10:39:33.5492645Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:33.5492952Z convolution 0.0133 ms 100.0% 2025-09-07T10:39:33.5493855Z triton_convolution2d_37 0.0133 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.5495340Z triton_convolution2d_41 0.0133 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.5496826Z triton_convolution2d_38 0.0143 ms 92.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.5498367Z triton_convolution2d_40 0.0143 ms 92.9% ALLOW_TF32=False, 
BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.5499855Z triton_convolution2d_42 0.0154 ms 86.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.5501345Z triton_convolution2d_39 0.0195 ms 68.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:39:33.5502510Z SingleProcess AUTOTUNE benchmarking takes 0.1025 seconds and 0.0003 seconds precompiling for 7 choices 2025-09-07T10:39:33.6658472Z Autotune Choices Stats: 2025-09-07T10:39:33.6660314Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01228800043463707, "best_triton_pos": 1, "best_triton_time": 0.014336000196635723, "best_triton_kernel": "triton_convolution2d_118", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:39:33.6690880Z AUTOTUNE convolution(4x32x27x27, 128x32x3x3) 2025-09-07T10:39:33.6691268Z strides: [23328, 1, 864, 32], [288, 1, 96, 32] 2025-09-07T10:39:33.6691620Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:33.6691928Z convolution 0.0123 ms 100.0% 2025-09-07T10:39:33.6692805Z triton_convolution2d_118 0.0143 ms 85.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.6694302Z triton_convolution2d_119 0.0154 ms 80.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, 
PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.6695803Z triton_convolution2d_114 0.0174 ms 70.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.6697372Z triton_convolution2d_120 0.0174 ms 70.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.6698881Z triton_convolution2d_117 0.0184 ms 66.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.6700533Z triton_convolution2d_115 0.0215 ms 57.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.6702041Z triton_convolution2d_116 0.0297 ms 41.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:39:33.6703210Z SingleProcess AUTOTUNE benchmarking takes 0.1177 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:39:33.7935055Z Autotune Choices Stats: 2025-09-07T10:39:33.7936683Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.014336000196635723, "best_triton_pos": 1, "best_triton_time": 0.020479999482631683, "best_triton_kernel": "triton_convolution2d_203", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:39:33.7967430Z 
AUTOTUNE convolution(4x48x13x13, 192x48x3x3) 2025-09-07T10:39:33.7967831Z strides: [8112, 1, 624, 48], [432, 1, 144, 48] 2025-09-07T10:39:33.7968180Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:33.7968487Z convolution 0.0143 ms 100.0% 2025-09-07T10:39:33.7969381Z triton_convolution2d_203 0.0205 ms 70.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.7970872Z triton_convolution2d_202 0.0266 ms 53.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.7972368Z triton_convolution2d_204 0.0287 ms 50.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.7973852Z triton_convolution2d_199 0.0297 ms 48.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.7975544Z triton_convolution2d_205 0.0307 ms 46.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.7977042Z triton_convolution2d_200 0.0348 ms 41.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.7978629Z triton_convolution2d_201 0.0440 ms 32.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 
2025-09-07T10:39:33.7979800Z SingleProcess AUTOTUNE benchmarking takes 0.1256 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:39:33.9239459Z Autotune Choices Stats: 2025-09-07T10:39:33.9241120Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01228800043463707, "best_triton_pos": 1, "best_triton_time": 0.020479999482631683, "best_triton_kernel": "triton_convolution2d_291", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:39:33.9271274Z AUTOTUNE convolution(4x64x13x13, 256x64x3x3) 2025-09-07T10:39:33.9271661Z strides: [10816, 1, 832, 64], [576, 1, 192, 64] 2025-09-07T10:39:33.9272025Z dtypes: torch.float16, torch.float16 2025-09-07T10:39:33.9272334Z convolution 0.0123 ms 100.0% 2025-09-07T10:39:33.9273441Z triton_convolution2d_291 0.0205 ms 60.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.9274954Z triton_convolution2d_290 0.0287 ms 42.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.9276461Z triton_convolution2d_292 0.0297 ms 41.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.9277962Z triton_convolution2d_293 0.0328 ms 37.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:39:33.9279464Z triton_convolution2d_288 0.0379 ms 32.4% 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.9280952Z triton_convolution2d_287 0.0389 ms 31.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:39:33.9282455Z triton_convolution2d_289 0.0553 ms 22.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:39:33.9283761Z SingleProcess AUTOTUNE benchmarking takes 0.1283 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:39:49.2877191Z pass 2025-09-07T10:39:53.1225551Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:39:53.1228427Z import pynvml # type: ignore[import] 2025-09-07T10:39:56.1338253Z 2025-09-07T10:39:57.9166064Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:39:57.9166338Z 2025-09-07T10:39:58.6679425Z Loading pipeline components...: 0% 0/6 [00:00 will be ignored 2025-09-07T10:47:56.9858036Z pass 2025-09-07T10:48:02.9168919Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:48:02.9170335Z import pynvml # type: ignore[import] 2025-09-07T10:48:05.6927096Z 2025-09-07T10:48:09.3102884Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:48:09.3103302Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:48:09.3103616Z cuda train timm_nfnet 2025-09-07T10:48:35.0574364Z Autotune Choices Stats: 2025-09-07T10:48:35.0575746Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_12", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.03174399957060814, "best_triton_pos": 0} 2025-09-07T10:48:35.0647024Z AUTOTUNE convolution(4x32x96x96, 64x32x3x3) 2025-09-07T10:48:35.0647434Z strides: [294912, 9216, 96, 1], [288, 9, 3, 1] 2025-09-07T10:48:35.0647783Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:35.0648725Z triton_convolution2d_12 0.0317 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:35.0650614Z triton_convolution2d_14 0.0317 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:35.0652111Z triton_convolution2d_15 0.0328 ms 96.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:35.0653592Z triton_convolution2d_11 0.0338 ms 93.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:35.0655076Z triton_convolution2d_16 0.0358 ms 88.6% ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:35.0656569Z triton_convolution2d_17 0.0358 ms 88.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:35.0657586Z convolution 0.0420 ms 75.6% 2025-09-07T10:48:35.0658463Z triton_convolution2d_13 0.0881 ms 36.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:48:35.0659843Z SingleProcess AUTOTUNE benchmarking takes 0.1411 seconds and 0.0004 seconds precompiling for 8 choices 2025-09-07T10:48:35.5410819Z Autotune Choices Stats: 2025-09-07T10:48:35.5412159Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_29", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T10:48:35.5521698Z AUTOTUNE convolution(4x128x48x48, 256x128x1x1) 2025-09-07T10:48:35.5522117Z strides: [294912, 2304, 48, 1], [128, 1, 1, 1] 2025-09-07T10:48:35.5522469Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:35.5523578Z triton_convolution2d_29 0.0154 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:35.5525095Z triton_convolution2d_28 0.0174 ms 88.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 
2025-09-07T10:48:35.5526589Z triton_convolution2d_30 0.0174 ms 88.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:35.5528092Z triton_convolution2d_31 0.0174 ms 88.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:35.5529577Z triton_convolution2d_26 0.0184 ms 83.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:35.5531052Z triton_convolution2d_25 0.0205 ms 75.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:35.5532545Z triton_convolution2d_27 0.0215 ms 71.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:35.5533727Z convolution 0.0256 ms 60.0% 2025-09-07T10:48:35.5534017Z conv1x1_via_mm 0.0696 ms 22.1% 2025-09-07T10:48:35.5534582Z SingleProcess AUTOTUNE benchmarking takes 0.1591 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:36.1168153Z Autotune Choices Stats: 2025-09-07T10:48:36.1169518Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_82", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.020479999482631683, "best_triton_pos": 0} 2025-09-07T10:48:36.1241684Z AUTOTUNE convolution(4x256x48x48, 256x256x1x1) 
2025-09-07T10:48:36.1242089Z strides: [589824, 2304, 48, 1], [256, 1, 1, 1] 2025-09-07T10:48:36.1242436Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:36.1243739Z triton_convolution2d_82 0.0205 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:36.1245236Z triton_convolution2d_81 0.0225 ms 90.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:36.1246718Z triton_convolution2d_84 0.0225 ms 90.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:36.1248522Z triton_convolution2d_79 0.0256 ms 80.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:36.1250008Z triton_convolution2d_78 0.0266 ms 76.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:36.1251489Z triton_convolution2d_83 0.0317 ms 64.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:36.1252971Z triton_convolution2d_80 0.0338 ms 60.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:36.1253884Z convolution 0.0379 ms 54.1% 2025-09-07T10:48:36.1254191Z conv1x1_via_mm 0.0881 ms 23.3% 
2025-09-07T10:48:36.1254744Z SingleProcess AUTOTUNE benchmarking takes 0.1632 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:36.5711307Z Autotune Choices Stats: 2025-09-07T10:48:36.5712659Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_141", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.02457600086927414, "best_triton_pos": 0} 2025-09-07T10:48:36.5783801Z AUTOTUNE convolution(4x512x24x24, 768x512x1x1) 2025-09-07T10:48:36.5784245Z strides: [294912, 576, 24, 1], [512, 1, 1, 1] 2025-09-07T10:48:36.5784581Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:36.5785498Z triton_convolution2d_141 0.0246 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:36.5787021Z triton_convolution2d_142 0.0297 ms 82.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:36.5788811Z triton_convolution2d_143 0.0297 ms 82.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:36.5790319Z triton_convolution2d_140 0.0307 ms 80.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:36.5791233Z convolution 0.0338 ms 72.7% 2025-09-07T10:48:36.5792115Z triton_convolution2d_138 0.0369 ms 66.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, 
PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:36.5793595Z triton_convolution2d_137 0.0410 ms 60.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:36.5795089Z triton_convolution2d_139 0.0563 ms 43.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:36.5795998Z conv1x1_via_mm 0.0819 ms 30.0% 2025-09-07T10:48:36.5796568Z SingleProcess AUTOTUNE benchmarking takes 0.1674 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:36.9754369Z Autotune Choices Stats: 2025-09-07T10:48:36.9755719Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_5", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T10:48:36.9826686Z AUTOTUNE convolution(4x16x96x96, 32x16x3x3) 2025-09-07T10:48:36.9827094Z strides: [147456, 9216, 96, 1], [144, 9, 3, 1] 2025-09-07T10:48:36.9827464Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:36.9828366Z triton_convolution2d_5 0.0164 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:36.9829863Z triton_convolution2d_6 0.0164 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:36.9831360Z triton_convolution2d_8 0.0164 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:36.9832859Z triton_convolution2d_10 0.0164 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:36.9834347Z triton_convolution2d_9 0.0174 ms 94.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:36.9835825Z triton_convolution2d_7 0.0276 ms 59.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:48:36.9836744Z convolution 0.0307 ms 53.3% 2025-09-07T10:48:36.9837314Z SingleProcess AUTOTUNE benchmarking takes 0.1077 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T10:48:37.4795795Z Autotune Choices Stats: 2025-09-07T10:48:37.4797847Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03993599861860275, "best_triton_pos": 1, "best_triton_time": 0.043007999658584595, "best_triton_kernel": "triton_convolution2d_24", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:48:37.4868585Z AUTOTUNE convolution(4x64x97x97, 128x64x3x3) 2025-09-07T10:48:37.4868984Z strides: [602176, 9409, 97, 1], [576, 9, 3, 1] 2025-09-07T10:48:37.4869341Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:37.4869662Z convolution 0.0399 ms 100.0% 2025-09-07T10:48:37.4870563Z triton_convolution2d_24 0.0430 ms 92.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, 
GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:37.4872092Z triton_convolution2d_21 0.0502 ms 79.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:37.4873586Z triton_convolution2d_19 0.0532 ms 75.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:37.4875068Z triton_convolution2d_18 0.0563 ms 70.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:37.4876543Z triton_convolution2d_23 0.0625 ms 63.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:37.4881999Z triton_convolution2d_22 0.0655 ms 60.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:37.4883751Z triton_convolution2d_20 0.1495 ms 26.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:48:37.4884929Z SingleProcess AUTOTUNE benchmarking takes 0.1590 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:48:37.9733910Z Autotune Choices Stats: 2025-09-07T10:48:37.9735289Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_36", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, 
KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T10:48:37.9806863Z AUTOTUNE convolution(4x128x48x48, 128x128x1x1) 2025-09-07T10:48:37.9807288Z strides: [294912, 2304, 48, 1], [128, 1, 1, 1] 2025-09-07T10:48:37.9807635Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:37.9808561Z triton_convolution2d_36 0.0133 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:37.9810054Z triton_convolution2d_37 0.0143 ms 92.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:37.9811535Z triton_convolution2d_35 0.0154 ms 86.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:37.9813022Z triton_convolution2d_38 0.0154 ms 86.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:37.9814854Z triton_convolution2d_32 0.0164 ms 81.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:37.9816347Z triton_convolution2d_33 0.0164 ms 81.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:37.9817251Z convolution 0.0174 ms 76.5% 2025-09-07T10:48:37.9818216Z triton_convolution2d_34 0.0195 ms 68.4% 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:37.9819125Z conv1x1_via_mm 0.0522 ms 25.5% 2025-09-07T10:48:37.9819693Z SingleProcess AUTOTUNE benchmarking takes 0.1519 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:38.4955703Z Autotune Choices Stats: 2025-09-07T10:48:38.4957411Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.043007999658584595, "best_triton_pos": 1, "best_triton_time": 0.07168000191450119, "best_triton_kernel": "triton_convolution2d_45", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:48:38.5029964Z AUTOTUNE convolution(4x128x48x48, 128x128x3x3) 2025-09-07T10:48:38.5030389Z strides: [294912, 2304, 48, 1], [1152, 9, 3, 1] 2025-09-07T10:48:38.5031110Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:38.5031439Z convolution 0.0430 ms 100.0% 2025-09-07T10:48:38.5032332Z triton_convolution2d_45 0.0717 ms 60.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:38.5033822Z triton_convolution2d_43 0.0891 ms 48.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:38.5035314Z triton_convolution2d_40 0.0922 ms 46.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:38.5036805Z triton_convolution2d_42 0.0922 ms 46.7% ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:38.5038296Z triton_convolution2d_39 0.0983 ms 43.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:38.5039794Z triton_convolution2d_44 0.1126 ms 38.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:38.5041290Z triton_convolution2d_41 0.1915 ms 22.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:48:38.5042467Z SingleProcess AUTOTUNE benchmarking takes 0.1883 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:48:38.7747980Z Autotune Choices Stats: 2025-09-07T10:48:38.7749346Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_75", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T10:48:38.7820877Z AUTOTUNE convolution(4x256x24x24, 512x256x1x1) 2025-09-07T10:48:38.7821611Z strides: [147456, 576, 24, 1], [256, 1, 1, 1] 2025-09-07T10:48:38.7822061Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:38.7823106Z triton_convolution2d_75 0.0154 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:38.7824617Z triton_convolution2d_74 0.0184 ms 83.3% 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:38.7826116Z triton_convolution2d_76 0.0184 ms 83.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:38.7827599Z triton_convolution2d_77 0.0184 ms 83.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:38.7829074Z triton_convolution2d_72 0.0225 ms 68.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:38.7830546Z triton_convolution2d_71 0.0236 ms 65.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:38.7831620Z convolution 0.0266 ms 57.7% 2025-09-07T10:48:38.7832502Z triton_convolution2d_73 0.0307 ms 50.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:38.7833400Z conv1x1_via_mm 0.0532 ms 28.8% 2025-09-07T10:48:38.7833978Z SingleProcess AUTOTUNE benchmarking takes 0.1550 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:39.3448189Z Autotune Choices Stats: 2025-09-07T10:48:39.3449851Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01740800030529499, "best_triton_pos": 1, "best_triton_time": 0.02252800017595291, "best_triton_kernel": "triton_convolution2d_134", "best_triton_kernel_desc": 
"ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:48:39.3522642Z AUTOTUNE convolution(4x512x12x12, 1536x512x1x1) 2025-09-07T10:48:39.3523117Z strides: [73728, 144, 12, 1], [512, 1, 1, 1] 2025-09-07T10:48:39.3523465Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:39.3523796Z convolution 0.0174 ms 100.0% 2025-09-07T10:48:39.3524715Z triton_convolution2d_134 0.0225 ms 77.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:39.3526210Z triton_convolution2d_133 0.0307 ms 56.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:39.3527704Z triton_convolution2d_135 0.0307 ms 56.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:39.3529182Z triton_convolution2d_136 0.0317 ms 54.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:39.3530959Z triton_convolution2d_130 0.0389 ms 44.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:39.3532459Z triton_convolution2d_131 0.0389 ms 44.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:39.3533960Z triton_convolution2d_132 
0.0502 ms 34.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:39.3534881Z conv1x1_via_mm 0.0645 ms 27.0% 2025-09-07T10:48:39.3535447Z SingleProcess AUTOTUNE benchmarking takes 0.1662 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:39.8453835Z Autotune Choices Stats: 2025-09-07T10:48:39.8462573Z {"num_choices": 6, "num_triton_choices": 5, "best_kernel": "triton_convolution2d_0", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T10:48:39.8528863Z AUTOTUNE convolution(4x3x193x193, 16x3x3x3) 2025-09-07T10:48:39.8529238Z strides: [111747, 37249, 193, 1], [27, 9, 3, 1] 2025-09-07T10:48:39.8529596Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:39.8530511Z triton_convolution2d_0 0.0133 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:48:39.8532311Z triton_convolution2d_3 0.0133 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:39.8533816Z triton_convolution2d_4 0.0133 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:48:39.8535304Z triton_convolution2d_1 0.0164 ms 81.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, 
num_stages=2, num_warps=4 2025-09-07T10:48:39.8536202Z convolution 0.0184 ms 72.2% 2025-09-07T10:48:39.8537085Z triton_convolution2d_2 0.0184 ms 72.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:48:39.8538350Z SingleProcess AUTOTUNE benchmarking takes 0.0929 seconds and 0.0002 seconds precompiling for 6 choices 2025-09-07T10:48:40.3582668Z Autotune Choices Stats: 2025-09-07T10:48:40.3584064Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_108", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.02252800017595291, "best_triton_pos": 0} 2025-09-07T10:48:40.3659662Z AUTOTUNE convolution(4x512x24x24, 256x512x1x1) 2025-09-07T10:48:40.3660043Z strides: [294912, 576, 24, 1], [512, 1, 1, 1] 2025-09-07T10:48:40.3660379Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:40.3661298Z triton_convolution2d_108 0.0225 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:40.3662240Z convolution 0.0276 ms 81.5% 2025-09-07T10:48:40.3663132Z triton_convolution2d_107 0.0287 ms 78.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:40.3664956Z triton_convolution2d_109 0.0287 ms 78.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:40.3666463Z triton_convolution2d_110 0.0287 ms 78.6% ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:40.3667964Z triton_convolution2d_105 0.0358 ms 62.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:40.3669445Z triton_convolution2d_104 0.0410 ms 55.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:40.3670353Z conv1x1_via_mm 0.0512 ms 44.0% 2025-09-07T10:48:40.3671248Z triton_convolution2d_106 0.0512 ms 44.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:40.3672424Z SingleProcess AUTOTUNE benchmarking takes 0.1667 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:40.9063267Z Autotune Choices Stats: 2025-09-07T10:48:40.9064957Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.02457600086927414, "best_triton_pos": 1, "best_triton_time": 0.04915200173854828, "best_triton_kernel": "triton_convolution2d_167", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:48:40.9138566Z AUTOTUNE convolution(4x1536x12x12, 768x1536x1x1) 2025-09-07T10:48:40.9138990Z strides: [221184, 144, 12, 1], [1536, 1, 1, 1] 2025-09-07T10:48:40.9139362Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:40.9139672Z convolution 0.0246 ms 100.0% 2025-09-07T10:48:40.9140570Z triton_convolution2d_167 0.0492 ms 50.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:40.9142077Z triton_convolution2d_166 0.0707 ms 34.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:40.9143006Z conv1x1_via_mm 0.0727 ms 33.8% 2025-09-07T10:48:40.9143895Z triton_convolution2d_168 0.0737 ms 33.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:40.9145400Z triton_convolution2d_169 0.0737 ms 33.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:40.9146890Z triton_convolution2d_164 0.0922 ms 26.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:40.9148363Z triton_convolution2d_163 0.0963 ms 25.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:40.9149866Z triton_convolution2d_165 0.1290 ms 19.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:40.9151385Z SingleProcess AUTOTUNE benchmarking takes 0.2022 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:41.7522400Z Autotune Choices Stats: 2025-09-07T10:48:41.7524314Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 
0.025599999353289604, "best_triton_pos": 1, "best_triton_time": 0.05222399905323982, "best_triton_kernel": "triton_convolution2d_297", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:48:41.7595024Z AUTOTUNE convolution(4x1536x6x6, 1536x1536x1x1) 2025-09-07T10:48:41.7595427Z strides: [55296, 36, 6, 1], [1536, 1, 1, 1] 2025-09-07T10:48:41.7595757Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:41.7596075Z convolution 0.0256 ms 100.0% 2025-09-07T10:48:41.7596981Z triton_convolution2d_297 0.0522 ms 49.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:41.7598483Z triton_convolution2d_298 0.0676 ms 37.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:41.7599390Z conv1x1_via_mm 0.0727 ms 35.2% 2025-09-07T10:48:41.7600280Z triton_convolution2d_296 0.0768 ms 33.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:41.7602076Z triton_convolution2d_299 0.0778 ms 32.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:41.7603706Z triton_convolution2d_295 0.0819 ms 31.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:41.7605188Z triton_convolution2d_294 0.0973 ms 26.3% 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:41.7606665Z triton_convolution2d_293 0.1014 ms 25.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:41.7607838Z SingleProcess AUTOTUNE benchmarking takes 0.1970 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:42.3224139Z Autotune Choices Stats: 2025-09-07T10:48:42.3225867Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.02457600086927414, "best_triton_pos": 2, "best_triton_time": 0.05119999870657921, "best_triton_kernel": "triton_convolution2d_330", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:48:42.3297849Z AUTOTUNE convolution(4x1536x6x6, 768x1536x1x1) 2025-09-07T10:48:42.3298228Z strides: [55296, 36, 6, 1], [1536, 1, 1, 1] 2025-09-07T10:48:42.3298554Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:42.3298870Z convolution 0.0246 ms 100.0% 2025-09-07T10:48:42.3299163Z conv1x1_via_mm 0.0471 ms 52.2% 2025-09-07T10:48:42.3300073Z triton_convolution2d_330 0.0512 ms 48.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:42.3301565Z triton_convolution2d_331 0.0666 ms 36.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:42.3303404Z triton_convolution2d_329 0.0758 ms 32.4% ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:42.3304914Z triton_convolution2d_332 0.0768 ms 32.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:42.3306401Z triton_convolution2d_328 0.0788 ms 31.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:42.3307901Z triton_convolution2d_327 0.0860 ms 28.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:42.3309397Z triton_convolution2d_326 0.1014 ms 24.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:42.3310563Z SingleProcess AUTOTUNE benchmarking takes 0.1978 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:42.8058803Z Autotune Choices Stats: 2025-09-07T10:48:42.8060468Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.013311999849975109, "best_triton_pos": 2, "best_triton_time": 0.025599999353289604, "best_triton_kernel": "triton_convolution2d_161", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:48:42.8131743Z AUTOTUNE convolution(4x768x1x1, 1536x768x1x1) 2025-09-07T10:48:42.8132154Z strides: [768, 1, 1, 1], [768, 1, 1, 1] 2025-09-07T10:48:42.8132511Z dtypes: torch.float16, torch.float16 
2025-09-07T10:48:42.8132821Z convolution 0.0133 ms 100.0% 2025-09-07T10:48:42.8133117Z conv1x1_via_mm 0.0164 ms 81.2% 2025-09-07T10:48:42.8134010Z triton_convolution2d_161 0.0256 ms 52.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:42.8135509Z triton_convolution2d_160 0.0287 ms 46.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:42.8137004Z triton_convolution2d_159 0.0328 ms 40.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:48:42.8138558Z triton_convolution2d_162 0.0328 ms 40.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:42.8140037Z triton_convolution2d_158 0.0369 ms 36.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:42.8141512Z triton_convolution2d_157 0.0451 ms 29.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:42.8142684Z SingleProcess AUTOTUNE benchmarking takes 0.1475 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:48:43.3202654Z Autotune Choices Stats: 2025-09-07T10:48:43.3204746Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.015359999611973763, "best_triton_pos": 2, "best_triton_time": 0.043007999658584595, 
"best_triton_kernel": "triton_convolution2d_155", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:48:43.3275168Z AUTOTUNE convolution(4x1536x1x1, 768x1536x1x1) 2025-09-07T10:48:43.3275581Z strides: [1536, 1, 1, 1], [1536, 1, 1, 1] 2025-09-07T10:48:43.3275918Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:43.3276231Z convolution 0.0154 ms 100.0% 2025-09-07T10:48:43.3276547Z conv1x1_via_mm 0.0195 ms 78.9% 2025-09-07T10:48:43.3277450Z triton_convolution2d_155 0.0430 ms 35.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:43.3278949Z triton_convolution2d_154 0.0492 ms 31.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:43.3280440Z triton_convolution2d_156 0.0573 ms 26.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:43.3281926Z triton_convolution2d_153 0.0625 ms 24.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:48:43.3283822Z triton_convolution2d_152 0.0696 ms 22.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:43.3285303Z triton_convolution2d_151 0.0850 ms 18.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, 
PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:43.3286480Z SingleProcess AUTOTUNE benchmarking takes 0.1666 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:48:43.8028264Z Autotune Choices Stats: 2025-09-07T10:48:43.8029908Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.010239999741315842, "best_triton_pos": 1, "best_triton_time": 0.01228800043463707, "best_triton_kernel": "triton_convolution2d_102", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:48:43.8101761Z AUTOTUNE convolution(4x256x1x1, 512x256x1x1) 2025-09-07T10:48:43.8102163Z strides: [256, 1, 1, 1], [256, 1, 1, 1] 2025-09-07T10:48:43.8102486Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:43.8102811Z convolution 0.0102 ms 100.0% 2025-09-07T10:48:43.8103727Z triton_convolution2d_102 0.0123 ms 83.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:43.8104643Z conv1x1_via_mm 0.0133 ms 76.9% 2025-09-07T10:48:43.8105526Z triton_convolution2d_100 0.0133 ms 76.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:48:43.8107013Z triton_convolution2d_101 0.0133 ms 76.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:43.8108500Z triton_convolution2d_99 0.0143 ms 71.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, 
UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:43.8116827Z triton_convolution2d_103 0.0154 ms 66.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:43.8118444Z triton_convolution2d_98 0.0184 ms 55.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:43.8119623Z SingleProcess AUTOTUNE benchmarking takes 0.1393 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:48:44.2112708Z Autotune Choices Stats: 2025-09-07T10:48:44.2114384Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.009216000325977802, "best_triton_pos": 1, "best_triton_time": 0.009216000325977802, "best_triton_kernel": "triton_convolution2d_67", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1"} 2025-09-07T10:48:44.2188063Z AUTOTUNE convolution(4x128x1x1, 256x128x1x1) 2025-09-07T10:48:44.2188514Z strides: [128, 1, 1, 1], [128, 1, 1, 1] 2025-09-07T10:48:44.2188849Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:44.2189158Z convolution 0.0092 ms 100.0% 2025-09-07T10:48:44.2190049Z triton_convolution2d_67 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:48:44.2191843Z triton_convolution2d_69 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:44.2193338Z triton_convolution2d_66 0.0102 
ms 90.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:44.2194814Z triton_convolution2d_68 0.0102 ms 90.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:44.2195719Z conv1x1_via_mm 0.0113 ms 81.8% 2025-09-07T10:48:44.2196582Z triton_convolution2d_70 0.0113 ms 81.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:44.2198063Z triton_convolution2d_65 0.0123 ms 75.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:44.2199230Z SingleProcess AUTOTUNE benchmarking takes 0.1363 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:48:44.6310980Z Autotune Choices Stats: 2025-09-07T10:48:44.6312642Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.011264000087976456, "best_triton_pos": 2, "best_triton_time": 0.01740800030529499, "best_triton_kernel": "triton_convolution2d_96", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:48:44.6386767Z AUTOTUNE convolution(4x512x1x1, 256x512x1x1) 2025-09-07T10:48:44.6387162Z strides: [512, 1, 1, 1], [512, 1, 1, 1] 2025-09-07T10:48:44.6387485Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:44.6387792Z convolution 0.0113 ms 100.0% 2025-09-07T10:48:44.6388082Z conv1x1_via_mm 0.0133 ms 84.6% 2025-09-07T10:48:44.6389214Z 
triton_convolution2d_96 0.0174 ms 64.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:44.6390707Z triton_convolution2d_95 0.0205 ms 55.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:44.6392194Z triton_convolution2d_93 0.0215 ms 52.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:44.6393668Z triton_convolution2d_94 0.0215 ms 52.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:48:44.6395133Z triton_convolution2d_97 0.0225 ms 50.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:44.6396612Z triton_convolution2d_92 0.0307 ms 36.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:44.6397781Z SingleProcess AUTOTUNE benchmarking takes 0.1427 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:48:45.0339417Z Autotune Choices Stats: 2025-09-07T10:48:45.0341074Z {"num_choices": 7, "num_triton_choices": 5, "best_kernel": "convolution", "best_time": 0.010239999741315842, "best_triton_pos": 1, "best_triton_time": 0.01228800043463707, "best_triton_kernel": "triton_convolution2d_62", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, 
KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1"} 2025-09-07T10:48:45.0413488Z AUTOTUNE convolution(4x256x1x1, 128x256x1x1) 2025-09-07T10:48:45.0413888Z strides: [256, 1, 1, 1], [256, 1, 1, 1] 2025-09-07T10:48:45.0414218Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:45.0414543Z convolution 0.0102 ms 100.0% 2025-09-07T10:48:45.0415437Z triton_convolution2d_62 0.0123 ms 83.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:48:45.0416911Z triton_convolution2d_64 0.0123 ms 83.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:45.0417859Z conv1x1_via_mm 0.0133 ms 76.9% 2025-09-07T10:48:45.0418739Z triton_convolution2d_61 0.0133 ms 76.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:45.0420216Z triton_convolution2d_63 0.0133 ms 76.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:45.0421694Z triton_convolution2d_60 0.0154 ms 66.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:45.0422871Z SingleProcess AUTOTUNE benchmarking takes 0.1238 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T10:48:46.1631424Z Autotune Choices Stats: 2025-09-07T10:48:46.1632780Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_148", "best_kernel_desc": 
"ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.02969600073993206, "best_triton_pos": 0} 2025-09-07T10:48:46.1710014Z AUTOTUNE convolution(4x768x12x12, 1536x768x1x1) 2025-09-07T10:48:46.1710465Z strides: [110592, 144, 12, 1], [768, 1, 1, 1] 2025-09-07T10:48:46.1710822Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:46.1711755Z triton_convolution2d_148 0.0297 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:46.1713253Z triton_convolution2d_147 0.0399 ms 74.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:46.1714759Z triton_convolution2d_149 0.0420 ms 70.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:46.1716252Z triton_convolution2d_150 0.0420 ms 70.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:46.1717741Z triton_convolution2d_144 0.0522 ms 56.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:46.1719227Z triton_convolution2d_145 0.0522 ms 56.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:46.1720310Z convolution 0.0553 ms 53.7% 
2025-09-07T10:48:46.1721194Z triton_convolution2d_146 0.0707 ms 42.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:46.1722102Z conv1x1_via_mm 0.0768 ms 38.7% 2025-09-07T10:48:46.1722679Z SingleProcess AUTOTUNE benchmarking takes 0.1771 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:46.3587855Z Autotune Choices Stats: 2025-09-07T10:48:46.3589520Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.02457600086927414, "best_triton_pos": 1, "best_triton_time": 0.02969600073993206, "best_triton_kernel": "triton_convolution2d_311", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:48:46.3664101Z AUTOTUNE convolution(4x768x6x6, 1536x768x1x1) 2025-09-07T10:48:46.3664529Z strides: [27648, 36, 6, 1], [768, 1, 1, 1] 2025-09-07T10:48:46.3664855Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:46.3665180Z convolution 0.0246 ms 100.0% 2025-09-07T10:48:46.3666098Z triton_convolution2d_311 0.0297 ms 82.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:46.3667588Z triton_convolution2d_312 0.0389 ms 63.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:46.3669084Z triton_convolution2d_310 0.0430 ms 57.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:46.3670588Z 
triton_convolution2d_313 0.0440 ms 55.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:46.3671498Z conv1x1_via_mm 0.0451 ms 54.5% 2025-09-07T10:48:46.3672694Z triton_convolution2d_309 0.0451 ms 54.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:46.3674183Z triton_convolution2d_308 0.0471 ms 52.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:46.3675656Z triton_convolution2d_307 0.0553 ms 44.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:46.3676840Z SingleProcess AUTOTUNE benchmarking takes 0.1702 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:46.5720115Z Autotune Choices Stats: 2025-09-07T10:48:46.5721776Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03481600061058998, "best_triton_pos": 1, "best_triton_time": 0.058368001133203506, "best_triton_kernel": "triton_convolution2d_382", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:48:46.5795052Z AUTOTUNE convolution(4x1536x6x6, 3072x1536x1x1) 2025-09-07T10:48:46.5795453Z strides: [55296, 36, 6, 1], [1536, 1, 1, 1] 2025-09-07T10:48:46.5795783Z dtypes: torch.float16, torch.float16 2025-09-07T10:48:46.5796379Z convolution 0.0348 ms 100.0% 2025-09-07T10:48:46.5797268Z triton_convolution2d_382 
0.0584 ms 59.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:46.5798776Z triton_convolution2d_383 0.0717 ms 48.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:46.5800284Z triton_convolution2d_381 0.0799 ms 43.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:46.5801774Z triton_convolution2d_384 0.0809 ms 43.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:48:46.5803363Z triton_convolution2d_380 0.0850 ms 41.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:48:46.5804845Z triton_convolution2d_379 0.0922 ms 37.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:46.5806341Z triton_convolution2d_378 0.1044 ms 33.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:48:46.5807246Z conv1x1_via_mm 0.1198 ms 29.1% 2025-09-07T10:48:46.5807816Z SingleProcess AUTOTUNE benchmarking takes 0.1991 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:48:46.9095535Z Autotune Choices Stats: 2025-09-07T10:48:46.9096737Z {"num_choices": 19, "num_triton_choices": 
17, "best_kernel": "triton_mm_389", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T10:48:46.9174014Z AUTOTUNE addmm(4x1000, 4x3072, 3072x1000) 2025-09-07T10:48:46.9174743Z strides: [0, 1], [3072, 1], [1, 3072] 2025-09-07T10:48:46.9175114Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:48:46.9175937Z triton_mm_389 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:48:46.9177142Z triton_mm_393 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:48:46.9177966Z bias_addmm 0.0205 ms 80.0% 2025-09-07T10:48:46.9178684Z triton_mm_397 0.0225 ms 72.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:48:46.9179871Z triton_mm_401 0.0246 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:48:46.9180635Z addmm 0.0266 ms 61.5% 2025-09-07T10:48:46.9181331Z triton_mm_388 0.0276 ms 59.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:48:46.9182501Z triton_mm_387 0.0297 ms 55.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:48:46.9183678Z triton_mm_392 0.0297 ms 55.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:48:46.9185021Z triton_mm_386 0.0307 ms 53.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:48:46.9186041Z SingleProcess AUTOTUNE benchmarking takes 0.3362 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:49:04.1447941Z Autotune Choices Stats: 2025-09-07T10:49:04.1449189Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_426", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T10:49:04.1529222Z AUTOTUNE mm(1000x4, 4x3072) 2025-09-07T10:49:04.1529585Z strides: [1, 1000], [3072, 1] 2025-09-07T10:49:04.1529893Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:04.1530689Z triton_mm_426 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:49:04.1531907Z triton_mm_419 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:49:04.1533105Z triton_mm_420 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:49:04.1534291Z triton_mm_421 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:49:04.1535472Z triton_mm_423 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=4 2025-09-07T10:49:04.1536658Z triton_mm_424 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:49:04.1537895Z triton_mm_425 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:49:04.1539442Z triton_mm_427 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:49:04.1540654Z triton_mm_428 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:49:04.1541850Z triton_mm_429 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:49:04.1542902Z SingleProcess AUTOTUNE benchmarking takes 0.2218 seconds and 0.0004 seconds precompiling for 17 choices 2025-09-07T10:49:04.9276125Z Autotune Choices Stats: 2025-09-07T10:49:04.9277362Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_410", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.014336000196635723, "best_triton_pos": 0} 2025-09-07T10:49:04.9358444Z AUTOTUNE mm(4x1000, 1000x3072) 2025-09-07T10:49:04.9358799Z strides: [1000, 1], [3072, 1] 2025-09-07T10:49:04.9359104Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:04.9359884Z triton_mm_410 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 
2025-09-07T10:49:04.9361470Z triton_mm_414 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:49:04.9362696Z triton_mm_418 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:49:04.9364103Z triton_mm_404 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:49:04.9365301Z triton_mm_406 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:49:04.9366053Z mm 0.0164 ms 87.5% 2025-09-07T10:49:04.9366730Z triton_mm_409 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:49:04.9367932Z triton_mm_413 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:49:04.9369120Z triton_mm_405 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:49:04.9370303Z triton_mm_416 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:49:04.9371330Z SingleProcess AUTOTUNE benchmarking takes 0.2529 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:49:13.3863056Z pass 2025-09-07T10:49:18.0861617Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:49:18.0863041Z import pynvml # type: ignore[import] 2025-09-07T10:49:20.9023842Z 2025-09-07T10:49:24.6435310Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:49:24.6435706Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:49:24.6436024Z cuda train timm_regnet 2025-09-07T10:49:50.7021003Z Autotune Choices Stats: 2025-09-07T10:49:50.7022382Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_10", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.02457600086927414, "best_triton_pos": 0} 2025-09-07T10:49:50.7098297Z AUTOTUNE convolution(4x32x112x112, 224x32x1x1) 2025-09-07T10:49:50.7098683Z strides: [401408, 12544, 112, 1], [32, 1, 1, 1] 2025-09-07T10:49:50.7099056Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:50.7099956Z triton_convolution2d_10 0.0246 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:50.7101465Z triton_convolution2d_9 0.0256 ms 96.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:50.7102955Z triton_convolution2d_12 0.0266 ms 92.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:50.7104430Z triton_convolution2d_11 
0.0276 ms 88.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:50.7106130Z triton_convolution2d_7 0.0287 ms 85.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:50.7107047Z convolution 0.0297 ms 82.8% 2025-09-07T10:49:50.7107923Z triton_convolution2d_6 0.0317 ms 77.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:50.7109406Z triton_convolution2d_8 0.0379 ms 64.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:49:50.7110295Z conv1x1_via_mm 0.1935 ms 12.7% 2025-09-07T10:49:50.7110866Z SingleProcess AUTOTUNE benchmarking takes 0.1655 seconds and 0.0003 seconds precompiling for 9 choices 2025-09-07T10:49:51.2168940Z Autotune Choices Stats: 2025-09-07T10:49:51.2170283Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_61", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8", "best_time": 0.03174399957060814, "best_triton_pos": 0} 2025-09-07T10:49:51.2246431Z AUTOTUNE convolution(4x224x56x56, 448x224x1x1) 2025-09-07T10:49:51.2246817Z strides: [702464, 3136, 56, 1], [224, 1, 1, 1] 2025-09-07T10:49:51.2247165Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:51.2248073Z triton_convolution2d_61 0.0317 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, 
PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:51.2249565Z triton_convolution2d_62 0.0338 ms 93.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:51.2251054Z triton_convolution2d_63 0.0338 ms 93.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:51.2252858Z triton_convolution2d_64 0.0338 ms 93.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:51.2254355Z triton_convolution2d_59 0.0369 ms 86.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:51.2255842Z triton_convolution2d_58 0.0430 ms 73.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:51.2256761Z convolution 0.0543 ms 58.5% 2025-09-07T10:49:51.2257737Z triton_convolution2d_60 0.0594 ms 53.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:49:51.2258644Z conv1x1_via_mm 0.1382 ms 23.0% 2025-09-07T10:49:51.2259231Z SingleProcess AUTOTUNE benchmarking takes 0.1716 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:49:51.6434837Z Autotune Choices Stats: 2025-09-07T10:49:51.6436206Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_31", "best_kernel_desc": 
"ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T10:49:51.6513704Z AUTOTUNE convolution(4x32x112x112, 224x32x1x1) 2025-09-07T10:49:51.6514120Z strides: [401408, 12544, 112, 1], [32, 1, 1, 1] 2025-09-07T10:49:51.6514479Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:51.6515434Z triton_convolution2d_31 0.0133 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:51.6516946Z triton_convolution2d_32 0.0133 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:51.6518440Z triton_convolution2d_33 0.0133 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:51.6519938Z triton_convolution2d_28 0.0143 ms 92.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:51.6521403Z triton_convolution2d_29 0.0143 ms 92.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:51.6523099Z triton_convolution2d_34 0.0143 ms 92.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:51.6524594Z triton_convolution2d_30 
0.0287 ms 46.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:49:51.6525506Z convolution 0.0317 ms 41.9% 2025-09-07T10:49:51.6526077Z SingleProcess AUTOTUNE benchmarking takes 0.1223 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:49:52.2259077Z Autotune Choices Stats: 2025-09-07T10:49:52.2260861Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_38", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8", "best_time": 0.021503999829292297, "best_triton_pos": 0} 2025-09-07T10:49:52.2337783Z AUTOTUNE convolution(4x224x56x56, 224x224x1x1) 2025-09-07T10:49:52.2338199Z strides: [702464, 3136, 56, 1], [224, 1, 1, 1] 2025-09-07T10:49:52.2338553Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:52.2339495Z triton_convolution2d_38 0.0215 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:52.2341036Z triton_convolution2d_40 0.0225 ms 95.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:52.2342537Z triton_convolution2d_41 0.0225 ms 95.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:52.2344019Z triton_convolution2d_36 0.0246 ms 87.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, 
num_stages=2, num_warps=4 2025-09-07T10:49:52.2345499Z triton_convolution2d_39 0.0246 ms 87.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:52.2347263Z triton_convolution2d_35 0.0276 ms 77.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:52.2348753Z triton_convolution2d_37 0.0328 ms 65.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:49:52.2349658Z convolution 0.0399 ms 53.8% 2025-09-07T10:49:52.2349958Z conv1x1_via_mm 0.0983 ms 21.9% 2025-09-07T10:49:52.2350530Z SingleProcess AUTOTUNE benchmarking takes 0.1620 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:49:52.9905585Z Autotune Choices Stats: 2025-09-07T10:49:52.9906974Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_191", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8", "best_time": 0.03174399957060814, "best_triton_pos": 0} 2025-09-07T10:49:52.9984670Z AUTOTUNE convolution(4x448x28x28, 896x448x1x1) 2025-09-07T10:49:52.9985060Z strides: [351232, 784, 28, 1], [448, 1, 1, 1] 2025-09-07T10:49:52.9985399Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:52.9986338Z triton_convolution2d_191 0.0317 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:52.9987863Z triton_convolution2d_193 0.0317 ms 100.0% ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:52.9989362Z triton_convolution2d_194 0.0317 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:52.9990859Z triton_convolution2d_192 0.0338 ms 93.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:52.9992687Z triton_convolution2d_188 0.0369 ms 86.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:52.9994181Z triton_convolution2d_189 0.0369 ms 86.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:52.9995083Z convolution 0.0563 ms 56.4% 2025-09-07T10:49:52.9995967Z triton_convolution2d_190 0.1004 ms 31.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:49:52.9996893Z conv1x1_via_mm 0.1044 ms 30.4% 2025-09-07T10:49:52.9997458Z SingleProcess AUTOTUNE benchmarking takes 0.1766 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:49:54.4743576Z Autotune Choices Stats: 2025-09-07T10:49:54.4744977Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_484", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, 
num_stages=2, num_warps=4", "best_time": 0.04710400104522705, "best_triton_pos": 0} 2025-09-07T10:49:54.4822020Z AUTOTUNE convolution(4x896x14x14, 2240x896x1x1) 2025-09-07T10:49:54.4822455Z strides: [175616, 196, 14, 1], [896, 1, 1, 1] 2025-09-07T10:49:54.4822804Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:54.4823704Z triton_convolution2d_484 0.0471 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:54.4827430Z convolution 0.0481 ms 97.9% 2025-09-07T10:49:54.4828340Z triton_convolution2d_483 0.0563 ms 83.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:54.4829838Z triton_convolution2d_480 0.0594 ms 79.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:54.4831319Z triton_convolution2d_485 0.0686 ms 68.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:54.4832811Z triton_convolution2d_486 0.0922 ms 51.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:54.4834294Z triton_convolution2d_481 0.1004 ms 46.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:54.4835196Z conv1x1_via_mm 0.1229 ms 38.3% 2025-09-07T10:49:54.4836087Z triton_convolution2d_482 0.1516 ms 31.1% ALLOW_TF32=False, 
BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:49:54.4837265Z SingleProcess AUTOTUNE benchmarking takes 0.2035 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:49:54.8944445Z Autotune Choices Stats: 2025-09-07T10:49:54.8945808Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_1", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.014336000196635723, "best_triton_pos": 0} 2025-09-07T10:49:54.9023459Z AUTOTUNE convolution(4x3x224x224, 32x3x3x3) 2025-09-07T10:49:54.9023824Z strides: [150528, 50176, 224, 1], [27, 9, 3, 1] 2025-09-07T10:49:54.9024537Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:54.9025462Z triton_convolution2d_1 0.0143 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:49:54.9026961Z triton_convolution2d_5 0.0143 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:49:54.9028456Z triton_convolution2d_3 0.0154 ms 93.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:49:54.9029935Z triton_convolution2d_0 0.0164 ms 87.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:49:54.9031392Z triton_convolution2d_4 0.0174 ms 82.4% 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:49:54.9032297Z convolution 0.0215 ms 66.7% 2025-09-07T10:49:54.9033177Z triton_convolution2d_2 0.0215 ms 66.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:49:54.9034512Z SingleProcess AUTOTUNE benchmarking takes 0.1068 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T10:49:55.3240696Z Autotune Choices Stats: 2025-09-07T10:49:55.3242051Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_85", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.01945599913597107, "best_triton_pos": 0} 2025-09-07T10:49:55.3320920Z AUTOTUNE convolution(4x224x56x56, 448x224x1x1) 2025-09-07T10:49:55.3321317Z strides: [702464, 3136, 56, 1], [224, 1, 1, 1] 2025-09-07T10:49:55.3321669Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:55.3322587Z triton_convolution2d_85 0.0195 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:55.3324201Z triton_convolution2d_86 0.0195 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:55.3325699Z triton_convolution2d_84 0.0205 ms 95.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 
2025-09-07T10:49:55.3327190Z triton_convolution2d_81 0.0246 ms 79.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:55.3328682Z triton_convolution2d_87 0.0256 ms 76.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:55.3330168Z triton_convolution2d_82 0.0276 ms 70.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:55.3331069Z convolution 0.0328 ms 59.4% 2025-09-07T10:49:55.3332328Z triton_convolution2d_83 0.0788 ms 24.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:49:55.3333630Z SingleProcess AUTOTUNE benchmarking takes 0.1336 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:49:55.9133069Z Autotune Choices Stats: 2025-09-07T10:49:55.9134771Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.018432000651955605, "best_triton_pos": 1, "best_triton_time": 0.023552000522613525, "best_triton_kernel": "triton_convolution2d_92", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:49:55.9212416Z AUTOTUNE convolution(4x448x28x28, 448x448x1x1) 2025-09-07T10:49:55.9212827Z strides: [351232, 784, 28, 1], [448, 1, 1, 1] 2025-09-07T10:49:55.9213166Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:55.9213494Z convolution 0.0184 ms 100.0% 2025-09-07T10:49:55.9214384Z 
triton_convolution2d_92 0.0236 ms 78.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:55.9215867Z triton_convolution2d_91 0.0287 ms 64.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:55.9217446Z triton_convolution2d_93 0.0287 ms 64.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:55.9219252Z triton_convolution2d_94 0.0287 ms 64.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:55.9220743Z triton_convolution2d_88 0.0328 ms 56.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:55.9222210Z triton_convolution2d_89 0.0338 ms 54.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:55.9223695Z triton_convolution2d_90 0.0696 ms 26.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:49:55.9224611Z conv1x1_via_mm 0.0748 ms 24.7% 2025-09-07T10:49:55.9225188Z SingleProcess AUTOTUNE benchmarking takes 0.1672 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:49:56.8631026Z Autotune Choices Stats: 2025-09-07T10:49:56.8632415Z {"num_choices": 8, 
"num_triton_choices": 7, "best_kernel": "triton_convolution2d_217", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.023552000522613525, "best_triton_pos": 0} 2025-09-07T10:49:56.8714347Z AUTOTUNE convolution(4x448x28x28, 896x448x1x1) 2025-09-07T10:49:56.8714727Z strides: [351232, 784, 28, 1], [448, 1, 1, 1] 2025-09-07T10:49:56.8715072Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:56.8715989Z triton_convolution2d_217 0.0236 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:56.8716901Z convolution 0.0276 ms 85.2% 2025-09-07T10:49:56.8718109Z triton_convolution2d_218 0.0287 ms 82.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:56.8719617Z triton_convolution2d_216 0.0307 ms 76.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:56.8721119Z triton_convolution2d_213 0.0379 ms 62.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:56.8722620Z triton_convolution2d_219 0.0410 ms 57.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:56.8724322Z triton_convolution2d_214 0.0451 ms 52.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, 
PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:56.8725807Z triton_convolution2d_215 0.0768 ms 30.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:49:56.8726991Z SingleProcess AUTOTUNE benchmarking takes 0.1373 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:49:57.4963701Z Autotune Choices Stats: 2025-09-07T10:49:57.4965397Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.028672000393271446, "best_triton_pos": 1, "best_triton_time": 0.03379200026392937, "best_triton_kernel": "triton_convolution2d_224", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:49:57.5042654Z AUTOTUNE convolution(4x896x14x14, 896x896x1x1) 2025-09-07T10:49:57.5043152Z strides: [175616, 196, 14, 1], [896, 1, 1, 1] 2025-09-07T10:49:57.5043508Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:57.5043823Z convolution 0.0287 ms 100.0% 2025-09-07T10:49:57.5044724Z triton_convolution2d_224 0.0338 ms 84.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:57.5046212Z triton_convolution2d_225 0.0420 ms 68.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:57.5047720Z triton_convolution2d_223 0.0481 ms 59.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 
2025-09-07T10:49:57.5049227Z triton_convolution2d_226 0.0492 ms 58.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:57.5050718Z triton_convolution2d_220 0.0563 ms 50.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:57.5051628Z conv1x1_via_mm 0.0686 ms 41.8% 2025-09-07T10:49:57.5052500Z triton_convolution2d_221 0.0686 ms 41.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:57.5053999Z triton_convolution2d_222 0.0850 ms 33.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:49:57.5055444Z SingleProcess AUTOTUNE benchmarking takes 0.1842 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:49:58.7885269Z Autotune Choices Stats: 2025-09-07T10:49:58.7886997Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.011264000087976456, "best_triton_pos": 1, "best_triton_time": 0.01228800043463707, "best_triton_kernel": "triton_convolution2d_497", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:49:58.7964404Z AUTOTUNE convolution(4x224x1x1, 2240x224x1x1) 2025-09-07T10:49:58.7964805Z strides: [224, 1, 1, 1], [224, 1, 1, 1] 2025-09-07T10:49:58.7965141Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:58.7965454Z convolution 0.0113 ms 100.0% 2025-09-07T10:49:58.7966387Z 
triton_convolution2d_497 0.0123 ms 91.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:58.7967299Z conv1x1_via_mm 0.0133 ms 84.6% 2025-09-07T10:49:58.7968181Z triton_convolution2d_495 0.0133 ms 84.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:49:58.7969668Z triton_convolution2d_496 0.0133 ms 84.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:58.7971510Z triton_convolution2d_494 0.0143 ms 78.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:58.7973010Z triton_convolution2d_498 0.0143 ms 78.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:58.7974485Z triton_convolution2d_493 0.0174 ms 64.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:58.7975656Z SingleProcess AUTOTUNE benchmarking takes 0.1383 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:49:59.2110208Z Autotune Choices Stats: 2025-09-07T10:49:59.2111885Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.009216000325977802, "best_triton_pos": 1, "best_triton_time": 0.009216000325977802, "best_triton_kernel": "triton_convolution2d_203", "best_triton_kernel_desc": "ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T10:49:59.2190346Z AUTOTUNE convolution(4x112x1x1, 896x112x1x1) 2025-09-07T10:49:59.2190743Z strides: [112, 1, 1, 1], [112, 1, 1, 1] 2025-09-07T10:49:59.2191076Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:59.2191401Z convolution 0.0092 ms 100.0% 2025-09-07T10:49:59.2192304Z triton_convolution2d_203 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:59.2193822Z triton_convolution2d_204 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:59.2195304Z triton_convolution2d_201 0.0102 ms 90.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:59.2197107Z triton_convolution2d_202 0.0102 ms 90.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:49:59.2198598Z triton_convolution2d_205 0.0102 ms 90.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:59.2199511Z conv1x1_via_mm 0.0123 ms 75.0% 2025-09-07T10:49:59.2200413Z triton_convolution2d_200 0.0123 ms 75.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 
2025-09-07T10:49:59.2201597Z SingleProcess AUTOTUNE benchmarking takes 0.1377 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:49:59.6353149Z Autotune Choices Stats: 2025-09-07T10:49:59.6354870Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.010239999741315842, "best_triton_pos": 1, "best_triton_time": 0.011264000087976456, "best_triton_kernel": "triton_convolution2d_237", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:49:59.6434858Z AUTOTUNE convolution(4x224x1x1, 896x224x1x1) 2025-09-07T10:49:59.6435222Z strides: [224, 1, 1, 1], [224, 1, 1, 1] 2025-09-07T10:49:59.6435874Z dtypes: torch.float16, torch.float16 2025-09-07T10:49:59.6436200Z convolution 0.0102 ms 100.0% 2025-09-07T10:49:59.6437092Z triton_convolution2d_237 0.0113 ms 90.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:59.6438611Z triton_convolution2d_236 0.0123 ms 83.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:59.6440086Z triton_convolution2d_235 0.0133 ms 76.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:49:59.6440987Z conv1x1_via_mm 0.0143 ms 71.4% 2025-09-07T10:49:59.6441866Z triton_convolution2d_234 0.0143 ms 71.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:59.6443586Z 
triton_convolution2d_238 0.0143 ms 71.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:49:59.6445085Z triton_convolution2d_233 0.0174 ms 58.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:49:59.6446262Z SingleProcess AUTOTUNE benchmarking takes 0.1411 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:50:00.0406341Z Autotune Choices Stats: 2025-09-07T10:50:00.0407644Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_72", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T10:50:00.0485690Z AUTOTUNE convolution(4x56x1x1, 448x56x1x1) 2025-09-07T10:50:00.0486082Z strides: [56, 1, 1, 1], [56, 1, 1, 1] 2025-09-07T10:50:00.0486403Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:00.0487660Z triton_convolution2d_72 0.0072 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:00.0488585Z convolution 0.0082 ms 87.5% 2025-09-07T10:50:00.0489463Z triton_convolution2d_69 0.0082 ms 87.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:00.0490960Z triton_convolution2d_70 0.0082 ms 87.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, 
num_stages=1, num_warps=1 2025-09-07T10:50:00.0492432Z triton_convolution2d_71 0.0082 ms 87.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:00.0493928Z triton_convolution2d_73 0.0082 ms 87.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:00.0495404Z triton_convolution2d_68 0.0102 ms 70.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:00.0496312Z conv1x1_via_mm 0.0113 ms 63.6% 2025-09-07T10:50:00.0497061Z SingleProcess AUTOTUNE benchmarking takes 0.1374 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:50:00.4583031Z Autotune Choices Stats: 2025-09-07T10:50:00.4584767Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.009216000325977802, "best_triton_pos": 1, "best_triton_time": 0.009216000325977802, "best_triton_kernel": "triton_convolution2d_103", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T10:50:00.4663258Z AUTOTUNE convolution(4x112x1x1, 448x112x1x1) 2025-09-07T10:50:00.4663626Z strides: [112, 1, 1, 1], [112, 1, 1, 1] 2025-09-07T10:50:00.4663958Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:00.4664277Z convolution 0.0092 ms 100.0% 2025-09-07T10:50:00.4665163Z triton_convolution2d_103 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 
2025-09-07T10:50:00.4666693Z triton_convolution2d_104 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:00.4668192Z triton_convolution2d_101 0.0102 ms 90.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:00.4669674Z triton_convolution2d_102 0.0102 ms 90.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:50:00.4671164Z triton_convolution2d_105 0.0102 ms 90.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:00.4672075Z conv1x1_via_mm 0.0123 ms 75.0% 2025-09-07T10:50:00.4672961Z triton_convolution2d_100 0.0123 ms 75.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:00.4674468Z SingleProcess AUTOTUNE benchmarking takes 0.1376 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:50:00.7504359Z Autotune Choices Stats: 2025-09-07T10:50:00.7505714Z {"num_choices": 7, "num_triton_choices": 5, "best_kernel": "triton_convolution2d_17", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.006144000217318535, "best_triton_pos": 0} 2025-09-07T10:50:00.7584981Z AUTOTUNE convolution(4x8x1x1, 224x8x1x1) 2025-09-07T10:50:00.7585355Z strides: [8, 1, 1, 1], [8, 1, 1, 1] 
2025-09-07T10:50:00.7585674Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:00.7586588Z triton_convolution2d_17 0.0061 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:00.7588095Z triton_convolution2d_16 0.0072 ms 85.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:00.7589572Z triton_convolution2d_18 0.0072 ms 85.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:50:00.7591042Z triton_convolution2d_19 0.0072 ms 85.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:00.7592817Z triton_convolution2d_20 0.0072 ms 85.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:00.7593709Z convolution 0.0092 ms 66.7% 2025-09-07T10:50:00.7594014Z conv1x1_via_mm 0.0113 ms 54.5% 2025-09-07T10:50:00.7594590Z SingleProcess AUTOTUNE benchmarking takes 0.1238 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T10:50:01.1887838Z Autotune Choices Stats: 2025-09-07T10:50:01.1889207Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_49", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 
2025-09-07T10:50:01.2046573Z AUTOTUNE convolution(4x56x1x1, 224x56x1x1) 2025-09-07T10:50:01.2047258Z strides: [56, 1, 1, 1], [56, 1, 1, 1] 2025-09-07T10:50:01.2047793Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:01.2049498Z triton_convolution2d_49 0.0072 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:01.2051194Z convolution 0.0082 ms 87.5% 2025-09-07T10:50:01.2052796Z triton_convolution2d_46 0.0082 ms 87.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:01.2055551Z triton_convolution2d_47 0.0082 ms 87.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:50:01.2058467Z triton_convolution2d_48 0.0082 ms 87.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:01.2061688Z triton_convolution2d_45 0.0092 ms 77.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:01.2063417Z conv1x1_via_mm 0.0113 ms 63.6% 2025-09-07T10:50:01.2065062Z triton_convolution2d_50 0.0266 ms 26.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:01.2067204Z SingleProcess AUTOTUNE benchmarking takes 0.1651 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:50:01.5213814Z Autotune Choices Stats: 
2025-09-07T10:50:01.5215488Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "conv1x1_via_mm", "best_time": 0.014336000196635723, "best_triton_pos": 1, "best_triton_time": 0.026623999699950218, "best_triton_kernel": "triton_convolution2d_231", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:50:01.5365024Z AUTOTUNE convolution(4x896x1x1, 224x896x1x1) 2025-09-07T10:50:01.5365419Z strides: [896, 1, 1, 1], [896, 1, 1, 1] 2025-09-07T10:50:01.5365763Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:01.5366071Z conv1x1_via_mm 0.0143 ms 100.0% 2025-09-07T10:50:01.5366973Z triton_convolution2d_231 0.0266 ms 53.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:01.5368463Z triton_convolution2d_230 0.0297 ms 48.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:01.5370287Z triton_convolution2d_229 0.0338 ms 42.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:50:01.5371778Z triton_convolution2d_232 0.0379 ms 37.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:01.5373268Z triton_convolution2d_227 0.0471 ms 30.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:01.5374745Z 
triton_convolution2d_228 0.0471 ms 30.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:01.5375639Z convolution 0.0881 ms 16.3% 2025-09-07T10:50:01.5376198Z SingleProcess AUTOTUNE benchmarking takes 0.1851 seconds and 0.0003 seconds precompiling for 8 choices 2025-09-07T10:50:02.0969166Z Autotune Choices Stats: 2025-09-07T10:50:02.0970853Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.01740800030529499, "best_triton_pos": 2, "best_triton_time": 0.058368001133203506, "best_triton_kernel": "triton_convolution2d_491", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:50:02.1050917Z AUTOTUNE convolution(4x2240x1x1, 224x2240x1x1) 2025-09-07T10:50:02.1051284Z strides: [2240, 1, 1, 1], [2240, 1, 1, 1] 2025-09-07T10:50:02.1051641Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:02.1051964Z convolution 0.0174 ms 100.0% 2025-09-07T10:50:02.1052261Z conv1x1_via_mm 0.0174 ms 100.0% 2025-09-07T10:50:02.1053141Z triton_convolution2d_491 0.0584 ms 29.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:02.1054969Z triton_convolution2d_490 0.0655 ms 26.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:02.1056469Z triton_convolution2d_492 0.0788 ms 22.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 
2025-09-07T10:50:02.1058037Z triton_convolution2d_489 0.0829 ms 21.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:50:02.1059523Z triton_convolution2d_488 0.0881 ms 19.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:02.1061020Z triton_convolution2d_487 0.1075 ms 16.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:02.1062200Z SingleProcess AUTOTUNE benchmarking takes 0.1803 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:50:02.2315333Z Autotune Choices Stats: 2025-09-07T10:50:02.2316941Z {"num_choices": 7, "num_triton_choices": 5, "best_kernel": "convolution", "best_time": 0.011264000087976456, "best_triton_pos": 2, "best_triton_time": 0.015359999611973763, "best_triton_kernel": "triton_convolution2d_99", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:50:02.2394473Z AUTOTUNE convolution(4x448x1x1, 112x448x1x1) 2025-09-07T10:50:02.2394868Z strides: [448, 1, 1, 1], [448, 1, 1, 1] 2025-09-07T10:50:02.2395220Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:02.2395542Z convolution 0.0113 ms 100.0% 2025-09-07T10:50:02.2395823Z conv1x1_via_mm 0.0133 ms 84.6% 2025-09-07T10:50:02.2396724Z triton_convolution2d_99 0.0154 ms 73.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:02.2398217Z 
triton_convolution2d_97 0.0174 ms 64.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:50:02.2399702Z triton_convolution2d_98 0.0174 ms 64.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:02.2401182Z triton_convolution2d_96 0.0195 ms 57.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:02.2402657Z triton_convolution2d_95 0.0215 ms 52.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:02.2404014Z SingleProcess AUTOTUNE benchmarking takes 0.1264 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T10:50:02.6631682Z Autotune Choices Stats: 2025-09-07T10:50:02.6633410Z {"num_choices": 7, "num_triton_choices": 5, "best_kernel": "convolution", "best_time": 0.013311999849975109, "best_triton_pos": 2, "best_triton_time": 0.026623999699950218, "best_triton_kernel": "triton_convolution2d_199", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:50:02.6717679Z AUTOTUNE convolution(4x896x1x1, 112x896x1x1) 2025-09-07T10:50:02.6718045Z strides: [896, 1, 1, 1], [896, 1, 1, 1] 2025-09-07T10:50:02.6718376Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:02.6718703Z convolution 0.0133 ms 100.0% 2025-09-07T10:50:02.6718985Z conv1x1_via_mm 0.0143 ms 92.9% 2025-09-07T10:50:02.6719874Z triton_convolution2d_199 0.0266 ms 50.0% 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:02.6721389Z triton_convolution2d_198 0.0297 ms 44.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:02.6723084Z triton_convolution2d_197 0.0307 ms 43.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:50:02.6724559Z triton_convolution2d_196 0.0348 ms 38.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:02.6726037Z triton_convolution2d_195 0.0369 ms 36.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:02.6727385Z SingleProcess AUTOTUNE benchmarking takes 0.1357 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T10:50:02.7688767Z Autotune Choices Stats: 2025-09-07T10:50:02.7690445Z {"num_choices": 5, "num_triton_choices": 3, "best_kernel": "convolution", "best_time": 0.010239999741315842, "best_triton_pos": 1, "best_triton_time": 0.011264000087976456, "best_triton_kernel": "triton_convolution2d_43", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1"} 2025-09-07T10:50:02.7767549Z AUTOTUNE convolution(4x224x1x1, 56x224x1x1) 2025-09-07T10:50:02.7767940Z strides: [224, 1, 1, 1], [224, 1, 1, 1] 2025-09-07T10:50:02.7768272Z dtypes: 
torch.float16, torch.float16 2025-09-07T10:50:02.7768582Z convolution 0.0102 ms 100.0% 2025-09-07T10:50:02.7769468Z triton_convolution2d_43 0.0113 ms 90.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:50:02.7770961Z triton_convolution2d_44 0.0113 ms 90.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:02.7772442Z triton_convolution2d_42 0.0123 ms 83.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:02.7773342Z conv1x1_via_mm 0.0133 ms 76.9% 2025-09-07T10:50:02.7773916Z SingleProcess AUTOTUNE benchmarking takes 0.0981 seconds and 0.0002 seconds precompiling for 5 choices 2025-09-07T10:50:03.1577712Z Autotune Choices Stats: 2025-09-07T10:50:03.1579406Z {"num_choices": 5, "num_triton_choices": 3, "best_kernel": "convolution", "best_time": 0.011264000087976456, "best_triton_pos": 2, "best_triton_time": 0.015359999611973763, "best_triton_kernel": "triton_convolution2d_67", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:50:03.1659103Z AUTOTUNE convolution(4x448x1x1, 56x448x1x1) 2025-09-07T10:50:03.1659447Z strides: [448, 1, 1, 1], [448, 1, 1, 1] 2025-09-07T10:50:03.1660134Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:03.1660461Z convolution 0.0113 ms 100.0% 2025-09-07T10:50:03.1660754Z conv1x1_via_mm 0.0133 ms 84.6% 2025-09-07T10:50:03.1661622Z triton_convolution2d_67 0.0154 ms 73.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, 
KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:03.1663102Z triton_convolution2d_66 0.0174 ms 64.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:50:03.1664578Z triton_convolution2d_65 0.0195 ms 57.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:03.1665749Z SingleProcess AUTOTUNE benchmarking takes 0.1020 seconds and 0.0002 seconds precompiling for 5 choices 2025-09-07T10:50:03.2640782Z Autotune Choices Stats: 2025-09-07T10:50:03.2642096Z {"num_choices": 5, "num_triton_choices": 3, "best_kernel": "triton_convolution2d_15", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=1", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:50:03.2723693Z AUTOTUNE convolution(4x224x1x1, 8x224x1x1) 2025-09-07T10:50:03.2726612Z strides: [224, 1, 1, 1], [224, 1, 1, 1] 2025-09-07T10:50:03.2726934Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:03.2727854Z triton_convolution2d_15 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=1 2025-09-07T10:50:03.2728768Z convolution 0.0102 ms 90.0% 2025-09-07T10:50:03.2729649Z triton_convolution2d_14 0.0102 ms 90.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:50:03.2731117Z triton_convolution2d_13 0.0113 ms 81.8% ALLOW_TF32=False, 
BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=1 2025-09-07T10:50:03.2732012Z conv1x1_via_mm 0.0123 ms 75.0% 2025-09-07T10:50:03.2732577Z SingleProcess AUTOTUNE benchmarking takes 0.0989 seconds and 0.0002 seconds precompiling for 5 choices 2025-09-07T10:50:05.0673844Z Autotune Choices Stats: 2025-09-07T10:50:05.0675568Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03993599861860275, "best_triton_pos": 1, "best_triton_time": 0.07782399654388428, "best_triton_kernel": "triton_convolution2d_503", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:50:05.0759265Z AUTOTUNE convolution(4x2240x7x7, 2240x2240x1x1) 2025-09-07T10:50:05.0759695Z strides: [109760, 49, 7, 1], [2240, 1, 1, 1] 2025-09-07T10:50:05.0760045Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:05.0760374Z convolution 0.0399 ms 100.0% 2025-09-07T10:50:05.0761254Z triton_convolution2d_503 0.0778 ms 51.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:05.0762769Z triton_convolution2d_504 0.0993 ms 40.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:05.0764815Z triton_convolution2d_502 0.1044 ms 38.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:05.0765755Z conv1x1_via_mm 0.1106 ms 36.1% 2025-09-07T10:50:05.0766652Z triton_convolution2d_499 
0.1341 ms 29.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:05.0768137Z triton_convolution2d_505 0.1362 ms 29.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:05.0769617Z triton_convolution2d_501 0.1516 ms 26.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:50:05.0771104Z triton_convolution2d_500 0.1812 ms 22.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:05.0772281Z SingleProcess AUTOTUNE benchmarking takes 0.2414 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:50:05.2327086Z Autotune Choices Stats: 2025-09-07T10:50:05.2328743Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.028672000393271446, "best_triton_pos": 1, "best_triton_time": 0.03788800165057182, "best_triton_kernel": "triton_convolution2d_510", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:50:05.2411099Z AUTOTUNE convolution(4x896x14x14, 2240x896x1x1) 2025-09-07T10:50:05.2411511Z strides: [175616, 196, 14, 1], [896, 1, 1, 1] 2025-09-07T10:50:05.2411847Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:05.2412181Z convolution 0.0287 ms 100.0% 2025-09-07T10:50:05.2413072Z triton_convolution2d_510 0.0379 ms 75.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, 
KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:05.2414564Z triton_convolution2d_511 0.0492 ms 58.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:05.2416074Z triton_convolution2d_509 0.0522 ms 54.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:05.2417643Z triton_convolution2d_506 0.0625 ms 45.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:05.2419136Z triton_convolution2d_512 0.0696 ms 41.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:50:05.2420613Z triton_convolution2d_507 0.0829 ms 34.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:50:05.2422112Z triton_convolution2d_508 0.1874 ms 15.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:50:05.2423292Z SingleProcess AUTOTUNE benchmarking takes 0.1637 seconds and 0.0003 seconds precompiling for 8 choices 2025-09-07T10:50:05.5492369Z Autotune Choices Stats: 2025-09-07T10:50:05.5493856Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_517", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T10:50:05.5579293Z AUTOTUNE addmm(4x1000, 4x2240, 2240x1000) 2025-09-07T10:50:05.5579644Z strides: [0, 1], [2240, 1], [1, 2240] 2025-09-07T10:50:05.5580009Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:50:05.5580837Z triton_mm_517 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:50:05.5582071Z triton_mm_521 0.0164 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:50:05.5583282Z triton_mm_525 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:50:05.5584060Z bias_addmm 0.0195 ms 78.9% 2025-09-07T10:50:05.5584772Z triton_mm_529 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:50:05.5585530Z addmm 0.0215 ms 71.4% 2025-09-07T10:50:05.5586239Z triton_mm_516 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:50:05.5587607Z triton_mm_515 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:50:05.5588793Z triton_mm_520 0.0236 ms 65.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:50:05.5589993Z triton_mm_514 0.0256 ms 60.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:50:05.5591009Z SingleProcess AUTOTUNE benchmarking takes 0.3154 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:50:19.7745486Z Autotune Choices Stats: 2025-09-07T10:50:19.7748354Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_552", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:50:19.7833307Z AUTOTUNE mm(1000x4, 4x2240) 2025-09-07T10:50:19.7833955Z strides: [1, 1000], [2240, 1] 2025-09-07T10:50:19.7834622Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:19.7836486Z triton_mm_552 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:50:19.7839401Z triton_mm_553 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:50:19.7842285Z triton_mm_548 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:50:19.7845538Z triton_mm_549 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:50:19.7848425Z triton_mm_554 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:50:19.7851859Z triton_mm_555 0.0102 ms 90.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:50:19.7854772Z triton_mm_556 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:50:19.7857706Z triton_mm_557 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:50:19.7860619Z triton_mm_558 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:50:19.7863517Z triton_mm_559 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:50:19.7866033Z SingleProcess AUTOTUNE benchmarking takes 0.2708 seconds and 0.0005 seconds precompiling for 17 choices 2025-09-07T10:50:20.5831047Z Autotune Choices Stats: 2025-09-07T10:50:20.5832283Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_538", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T10:50:20.5917337Z AUTOTUNE mm(4x1000, 1000x2240) 2025-09-07T10:50:20.5917707Z strides: [1000, 1], [2240, 1] 2025-09-07T10:50:20.5918017Z dtypes: torch.float16, torch.float16 2025-09-07T10:50:20.5918800Z triton_mm_538 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:50:20.5920057Z triton_mm_542 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:50:20.5921268Z triton_mm_546 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:50:20.5922025Z mm 0.0143 ms 92.9% 2025-09-07T10:50:20.5922720Z triton_mm_532 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:50:20.5924109Z triton_mm_534 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:50:20.5925311Z triton_mm_537 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:50:20.5926497Z triton_mm_533 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:50:20.5927693Z triton_mm_541 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:50:20.5928883Z triton_mm_544 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:50:20.5929916Z SingleProcess AUTOTUNE benchmarking takes 0.2550 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:50:27.7239915Z W0907 10:50:27.722000 97981 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:51:09.9309792Z W0907 10:51:09.930000 97981 
site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9316979Z W0907 10:51:09.931000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9329106Z W0907 10:51:09.932000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9336066Z W0907 10:51:09.933000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9343306Z W0907 10:51:09.934000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9350392Z W0907 10:51:09.934000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9357297Z W0907 10:51:09.935000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9364495Z W0907 10:51:09.936000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9371458Z W0907 10:51:09.936000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9386845Z W0907 10:51:09.938000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9393644Z W0907 10:51:09.939000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9400881Z W0907 10:51:09.939000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:51:09.9408004Z W0907 10:51:09.940000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9414709Z W0907 10:51:09.941000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9434760Z W0907 10:51:09.941000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9435950Z W0907 10:51:09.942000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9436958Z W0907 10:51:09.943000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9443217Z W0907 10:51:09.944000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9450125Z W0907 10:51:09.944000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9457049Z W0907 10:51:09.945000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9468385Z W0907 10:51:09.946000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9475150Z W0907 10:51:09.947000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9482616Z W0907 10:51:09.947000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9489837Z W0907 10:51:09.948000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:51:09.9496472Z W0907 10:51:09.949000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9504637Z W0907 10:51:09.950000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9524444Z W0907 10:51:09.952000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:09.9530913Z W0907 10:51:09.952000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0002476Z W0907 10:51:09.999000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0009738Z W0907 10:51:10.000000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0017175Z W0907 10:51:10.001000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0024746Z W0907 10:51:10.002000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0032079Z W0907 10:51:10.002000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0039725Z W0907 10:51:10.003000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0047634Z W0907 10:51:10.004000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0054259Z W0907 10:51:10.005000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:51:10.0061755Z W0907 10:51:10.005000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0071171Z W0907 10:51:10.006000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0082633Z W0907 10:51:10.007000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0090074Z W0907 10:51:10.008000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0097735Z W0907 10:51:10.009000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0105200Z W0907 10:51:10.010000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0112349Z W0907 10:51:10.010000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0119803Z W0907 10:51:10.011000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0127548Z W0907 10:51:10.012000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0134646Z W0907 10:51:10.013000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0142412Z W0907 10:51:10.013000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0150019Z W0907 10:51:10.014000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 
2025-09-07T10:51:10.0157389Z W0907 10:51:10.015000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0169125Z W0907 10:51:10.016000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0176229Z W0907 10:51:10.017000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0184902Z W0907 10:51:10.018000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0191862Z W0907 10:51:10.018000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0198786Z W0907 10:51:10.019000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0206576Z W0907 10:51:10.020000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0226612Z W0907 10:51:10.022000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.0233899Z W0907 10:51:10.023000 97981 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T10:51:10.1345873Z pass 2025-09-07T10:51:16.3346691Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:51:16.3349288Z import pynvml # type: ignore[import] 2025-09-07T10:51:19.1080753Z 2025-09-07T10:51:21.9451233Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:51:21.9451622Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:51:21.9456181Z cuda train timm_resnest 2025-09-07T10:51:37.2042226Z Autotune Choices Stats: 2025-09-07T10:51:37.2043805Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_14", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.03276799991726875, "best_triton_pos": 0} 2025-09-07T10:51:37.2126590Z AUTOTUNE convolution(4x32x112x112, 64x32x3x3) 2025-09-07T10:51:37.2127011Z strides: [401408, 12544, 112, 1], [288, 9, 3, 1] 2025-09-07T10:51:37.2127360Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:37.2128291Z triton_convolution2d_14 0.0328 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:51:37.2129802Z triton_convolution2d_19 0.0358 ms 91.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:51:37.2131303Z triton_convolution2d_16 0.0379 ms 86.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:51:37.2132785Z triton_convolution2d_17 0.0399 ms 82.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:51:37.2134248Z triton_convolution2d_13 0.0410 ms 80.0% ALLOW_TF32=False, 
BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:51:37.2135158Z convolution 0.0451 ms 72.7% 2025-09-07T10:51:37.2136011Z triton_convolution2d_18 0.0451 ms 72.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:51:37.2137543Z triton_convolution2d_15 0.0881 ms 37.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:51:37.2139136Z SingleProcess AUTOTUNE benchmarking takes 0.1453 seconds and 0.0003 seconds precompiling for 8 choices 2025-09-07T10:51:37.6958935Z Autotune Choices Stats: 2025-09-07T10:51:37.6960307Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_45", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8", "best_time": 0.014336000196635723, "best_triton_pos": 0} 2025-09-07T10:51:37.7042363Z AUTOTUNE convolution(4x64x56x56, 256x64x1x1) 2025-09-07T10:51:37.7042770Z strides: [200704, 3136, 56, 1], [64, 1, 1, 1] 2025-09-07T10:51:37.7043302Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:37.7044246Z triton_convolution2d_45 0.0143 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:37.7045774Z triton_convolution2d_47 0.0143 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 
2025-09-07T10:51:37.7047269Z triton_convolution2d_48 0.0143 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:37.7048766Z triton_convolution2d_43 0.0154 ms 93.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:37.7050571Z triton_convolution2d_46 0.0154 ms 93.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:37.7052067Z triton_convolution2d_42 0.0164 ms 87.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:37.7053556Z triton_convolution2d_44 0.0195 ms 73.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:51:37.7054470Z convolution 0.0215 ms 66.7% 2025-09-07T10:51:37.7054762Z conv1x1_via_mm 0.0748 ms 19.2% 2025-09-07T10:51:37.7055336Z SingleProcess AUTOTUNE benchmarking takes 0.1562 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:51:40.3207602Z Autotune Choices Stats: 2025-09-07T10:51:40.3209012Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_7", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.027648000046610832, "best_triton_pos": 0} 2025-09-07T10:51:40.3296467Z AUTOTUNE convolution(4x32x112x112, 32x32x3x3) 
2025-09-07T10:51:40.3296935Z strides: [401408, 12544, 112, 1], [288, 9, 3, 1] 2025-09-07T10:51:40.3297343Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:40.3298315Z triton_convolution2d_7 0.0276 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:51:40.3299846Z triton_convolution2d_12 0.0328 ms 84.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:51:40.3301699Z triton_convolution2d_6 0.0338 ms 81.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:51:40.3303172Z triton_convolution2d_9 0.0348 ms 79.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:51:40.3304651Z triton_convolution2d_10 0.0389 ms 71.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:51:40.3306133Z triton_convolution2d_11 0.0399 ms 69.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:51:40.3307040Z convolution 0.0420 ms 65.9% 2025-09-07T10:51:40.3307937Z triton_convolution2d_8 0.0492 ms 56.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:51:40.3309111Z SingleProcess AUTOTUNE benchmarking 
takes 0.1377 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:51:40.5156095Z Autotune Choices Stats: 2025-09-07T10:51:40.5158335Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_53", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.018432000651955605, "best_triton_pos": 0} 2025-09-07T10:51:40.5241975Z AUTOTUNE convolution(4x256x56x56, 128x256x1x1) 2025-09-07T10:51:40.5242601Z strides: [802816, 3136, 56, 1], [256, 1, 1, 1] 2025-09-07T10:51:40.5243455Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:40.5244992Z triton_convolution2d_53 0.0184 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:40.5247486Z triton_convolution2d_54 0.0195 ms 94.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:40.5249993Z triton_convolution2d_52 0.0215 ms 85.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:40.5252487Z triton_convolution2d_55 0.0215 ms 85.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:40.5254959Z triton_convolution2d_49 0.0236 ms 78.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:40.5257507Z triton_convolution2d_50 
0.0256 ms 72.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:40.5259992Z triton_convolution2d_51 0.0328 ms 56.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:51:40.5261528Z convolution 0.0358 ms 51.4% 2025-09-07T10:51:40.5262000Z conv1x1_via_mm 0.0840 ms 22.0% 2025-09-07T10:51:40.5262914Z SingleProcess AUTOTUNE benchmarking takes 0.1605 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:51:40.9802740Z Autotune Choices Stats: 2025-09-07T10:51:40.9804698Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_76", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.01740800030529499, "best_triton_pos": 0} 2025-09-07T10:51:40.9889932Z AUTOTUNE convolution(4x256x28x28, 512x256x1x1) 2025-09-07T10:51:40.9890335Z strides: [200704, 784, 28, 1], [256, 1, 1, 1] 2025-09-07T10:51:40.9890687Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:40.9891608Z triton_convolution2d_76 0.0174 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:40.9893132Z triton_convolution2d_75 0.0205 ms 85.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:40.9894634Z triton_convolution2d_78 0.0205 ms 85.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, 
PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:40.9896126Z triton_convolution2d_77 0.0215 ms 81.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:40.9897701Z triton_convolution2d_73 0.0246 ms 70.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:40.9899468Z triton_convolution2d_72 0.0256 ms 68.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:40.9900373Z convolution 0.0287 ms 60.7% 2025-09-07T10:51:40.9901267Z triton_convolution2d_74 0.0451 ms 38.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:51:40.9902184Z conv1x1_via_mm 0.0625 ms 27.9% 2025-09-07T10:51:40.9902760Z SingleProcess AUTOTUNE benchmarking takes 0.1677 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:51:42.4461384Z Autotune Choices Stats: 2025-09-07T10:51:42.4462749Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_69", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T10:51:42.4544839Z AUTOTUNE convolution(4x128x28x28, 512x128x1x1) 2025-09-07T10:51:42.4545229Z strides: [100352, 784, 28, 1], [128, 1, 1, 1] 2025-09-07T10:51:42.4545575Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:42.4546506Z 
triton_convolution2d_69 0.0133 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:42.4548011Z triton_convolution2d_68 0.0154 ms 86.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:42.4549499Z triton_convolution2d_70 0.0154 ms 86.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:42.4550980Z triton_convolution2d_71 0.0154 ms 86.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:42.4552812Z triton_convolution2d_66 0.0164 ms 81.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:42.4554311Z triton_convolution2d_65 0.0174 ms 76.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:42.4555219Z convolution 0.0205 ms 65.0% 2025-09-07T10:51:42.4556115Z triton_convolution2d_67 0.0266 ms 50.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:51:42.4557015Z conv1x1_via_mm 0.0502 ms 26.5% 2025-09-07T10:51:42.4557590Z SingleProcess AUTOTUNE benchmarking takes 0.1533 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:51:42.6434650Z Autotune Choices 
Stats: 2025-09-07T10:51:42.6436001Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_24", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T10:51:42.6520382Z AUTOTUNE convolution(4x64x56x56, 64x64x1x1) 2025-09-07T10:51:42.6520744Z strides: [200704, 3136, 56, 1], [64, 1, 1, 1] 2025-09-07T10:51:42.6521397Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:42.6522311Z triton_convolution2d_24 0.0102 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:42.6524013Z triton_convolution2d_25 0.0102 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:42.6525501Z triton_convolution2d_20 0.0113 ms 90.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:42.6526995Z triton_convolution2d_23 0.0113 ms 90.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:42.6527899Z convolution 0.0123 ms 83.3% 2025-09-07T10:51:42.6528771Z triton_convolution2d_26 0.0123 ms 83.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:42.6530257Z triton_convolution2d_21 0.0133 ms 76.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, 
BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:42.6531748Z triton_convolution2d_22 0.0133 ms 76.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:51:42.6532657Z conv1x1_via_mm 0.0399 ms 25.6% 2025-09-07T10:51:42.6533230Z SingleProcess AUTOTUNE benchmarking takes 0.1493 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:51:43.1100581Z Autotune Choices Stats: 2025-09-07T10:51:43.1102635Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.018432000651955605, "best_triton_pos": 1, "best_triton_time": 0.023552000522613525, "best_triton_kernel": "triton_convolution2d_83", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:51:43.1187673Z AUTOTUNE convolution(4x512x28x28, 256x512x1x1) 2025-09-07T10:51:43.1188098Z strides: [401408, 784, 28, 1], [512, 1, 1, 1] 2025-09-07T10:51:43.1188458Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:43.1188773Z convolution 0.0184 ms 100.0% 2025-09-07T10:51:43.1189672Z triton_convolution2d_83 0.0236 ms 78.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:43.1191194Z triton_convolution2d_82 0.0307 ms 60.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:43.1192696Z triton_convolution2d_85 0.0317 ms 58.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, 
KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:43.1194180Z triton_convolution2d_84 0.0328 ms 56.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:43.1195670Z triton_convolution2d_80 0.0389 ms 47.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:43.1197447Z triton_convolution2d_79 0.0420 ms 43.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:43.1198933Z triton_convolution2d_81 0.0532 ms 34.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:51:43.1199840Z conv1x1_via_mm 0.0655 ms 28.1% 2025-09-07T10:51:43.1200411Z SingleProcess AUTOTUNE benchmarking takes 0.1708 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:51:43.5777978Z Autotune Choices Stats: 2025-09-07T10:51:43.5779347Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_108", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.023552000522613525, "best_triton_pos": 0} 2025-09-07T10:51:43.5868014Z AUTOTUNE convolution(4x512x14x14, 1024x512x1x1) 2025-09-07T10:51:43.5868423Z strides: [100352, 196, 14, 1], [512, 1, 1, 1] 2025-09-07T10:51:43.5868773Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:43.5869701Z triton_convolution2d_108 0.0236 
ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:43.5870617Z convolution 0.0246 ms 95.8% 2025-09-07T10:51:43.5871496Z triton_convolution2d_109 0.0287 ms 82.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:43.5872988Z triton_convolution2d_107 0.0317 ms 74.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:43.5874493Z triton_convolution2d_110 0.0328 ms 71.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:43.5876314Z triton_convolution2d_104 0.0420 ms 56.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:43.5877807Z triton_convolution2d_105 0.0430 ms 54.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:43.5879291Z triton_convolution2d_106 0.0522 ms 45.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:51:43.5880209Z conv1x1_via_mm 0.0584 ms 40.4% 2025-09-07T10:51:43.5880779Z SingleProcess AUTOTUNE benchmarking takes 0.1710 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:51:45.0764170Z Autotune Choices Stats: 
2025-09-07T10:51:45.0765548Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_101", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T10:51:45.0850651Z AUTOTUNE convolution(4x256x14x14, 1024x256x1x1) 2025-09-07T10:51:45.0851064Z strides: [50176, 196, 14, 1], [256, 1, 1, 1] 2025-09-07T10:51:45.0851415Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:45.0852344Z triton_convolution2d_101 0.0164 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:45.0854235Z triton_convolution2d_102 0.0195 ms 84.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:45.0855749Z triton_convolution2d_100 0.0205 ms 80.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:45.0857285Z triton_convolution2d_103 0.0215 ms 76.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:45.0858206Z convolution 0.0225 ms 72.7% 2025-09-07T10:51:45.0859082Z triton_convolution2d_97 0.0256 ms 64.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:45.0860570Z triton_convolution2d_98 0.0256 ms 64.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, 
BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:45.0862062Z triton_convolution2d_99 0.0307 ms 53.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:51:45.0862971Z conv1x1_via_mm 0.0430 ms 38.1% 2025-09-07T10:51:45.0863539Z SingleProcess AUTOTUNE benchmarking takes 0.1588 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:51:45.3115002Z Autotune Choices Stats: 2025-09-07T10:51:45.3116713Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.025599999353289604, "best_triton_pos": 1, "best_triton_time": 0.035840000957250595, "best_triton_kernel": "triton_convolution2d_115", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:51:45.3202700Z AUTOTUNE convolution(4x1024x14x14, 512x1024x1x1) 2025-09-07T10:51:45.3203694Z strides: [200704, 196, 14, 1], [1024, 1, 1, 1] 2025-09-07T10:51:45.3204056Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:45.3204364Z convolution 0.0256 ms 100.0% 2025-09-07T10:51:45.3205254Z triton_convolution2d_115 0.0358 ms 71.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:45.3206745Z triton_convolution2d_116 0.0481 ms 53.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:45.3208257Z triton_convolution2d_114 0.0532 ms 48.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, 
KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:45.3209751Z triton_convolution2d_117 0.0543 ms 47.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:45.3210660Z conv1x1_via_mm 0.0573 ms 44.6% 2025-09-07T10:51:45.3211534Z triton_convolution2d_111 0.0758 ms 33.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:45.3213022Z triton_convolution2d_112 0.0768 ms 33.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:45.3214688Z triton_convolution2d_113 0.0932 ms 27.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:51:45.3215864Z SingleProcess AUTOTUNE benchmarking takes 0.1869 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:51:45.7273657Z Autotune Choices Stats: 2025-09-07T10:51:45.7275343Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.011264000087976456, "best_triton_pos": 1, "best_triton_time": 0.01228800043463707, "best_triton_kernel": "triton_convolution2d_128", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:51:45.7364923Z AUTOTUNE convolution(4x256x1x1, 1024x256x1x1) 2025-09-07T10:51:45.7365284Z strides: [256, 1, 1, 1], [256, 1, 1, 1] 2025-09-07T10:51:45.7365614Z dtypes: 
torch.float16, torch.float16 2025-09-07T10:51:45.7365940Z convolution 0.0113 ms 100.0% 2025-09-07T10:51:45.7366843Z triton_convolution2d_128 0.0123 ms 91.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:45.7368335Z triton_convolution2d_127 0.0133 ms 84.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:45.7369233Z conv1x1_via_mm 0.0143 ms 78.6% 2025-09-07T10:51:45.7370111Z triton_convolution2d_125 0.0143 ms 78.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:45.7371598Z triton_convolution2d_126 0.0143 ms 78.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:51:45.7373412Z triton_convolution2d_129 0.0154 ms 73.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:45.7374911Z triton_convolution2d_124 0.0195 ms 57.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:45.7376086Z SingleProcess AUTOTUNE benchmarking takes 0.1442 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:51:46.1523270Z Autotune Choices Stats: 2025-09-07T10:51:46.1524992Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.009216000325977802, "best_triton_pos": 1, "best_triton_time": 
0.009216000325977802, "best_triton_kernel": "triton_convolution2d_93", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1"} 2025-09-07T10:51:46.1611428Z AUTOTUNE convolution(4x128x1x1, 512x128x1x1) 2025-09-07T10:51:46.1611772Z strides: [128, 1, 1, 1], [128, 1, 1, 1] 2025-09-07T10:51:46.1612098Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:46.1612416Z convolution 0.0092 ms 100.0% 2025-09-07T10:51:46.1613302Z triton_convolution2d_93 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:51:46.1614799Z triton_convolution2d_94 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:46.1616536Z triton_convolution2d_95 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:46.1618102Z triton_convolution2d_92 0.0102 ms 90.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:46.1619008Z conv1x1_via_mm 0.0113 ms 81.8% 2025-09-07T10:51:46.1619905Z triton_convolution2d_96 0.0113 ms 81.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:46.1621463Z triton_convolution2d_91 0.0123 ms 75.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, 
PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:46.1622637Z SingleProcess AUTOTUNE benchmarking takes 0.1386 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:51:46.4125253Z Autotune Choices Stats: 2025-09-07T10:51:46.4126582Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_60", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T10:51:46.4213860Z AUTOTUNE convolution(4x64x1x1, 256x64x1x1) 2025-09-07T10:51:46.4214281Z strides: [64, 1, 1, 1], [64, 1, 1, 1] 2025-09-07T10:51:46.4214609Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:46.4215535Z triton_convolution2d_60 0.0082 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:46.4217063Z triton_convolution2d_61 0.0082 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:51:46.4218942Z triton_convolution2d_62 0.0082 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:46.4220424Z triton_convolution2d_63 0.0082 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:46.4221915Z triton_convolution2d_64 0.0082 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, 
PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:46.4222844Z convolution 0.0092 ms 88.9% 2025-09-07T10:51:46.4223725Z triton_convolution2d_59 0.0092 ms 88.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:46.4224630Z conv1x1_via_mm 0.0113 ms 72.7% 2025-09-07T10:51:46.4225201Z SingleProcess AUTOTUNE benchmarking takes 0.1373 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:51:46.9723228Z Autotune Choices Stats: 2025-09-07T10:51:46.9724606Z {"num_choices": 7, "num_triton_choices": 5, "best_kernel": "triton_convolution2d_30", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T10:51:46.9808797Z AUTOTUNE convolution(4x32x1x1, 128x32x1x1) 2025-09-07T10:51:46.9809247Z strides: [32, 1, 1, 1], [32, 1, 1, 1] 2025-09-07T10:51:46.9809564Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:46.9810521Z triton_convolution2d_30 0.0072 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:46.9812023Z triton_convolution2d_31 0.0072 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:46.9813505Z triton_convolution2d_32 0.0072 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:51:46.9814991Z 
triton_convolution2d_33 0.0072 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:46.9816473Z triton_convolution2d_34 0.0072 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:46.9817421Z convolution 0.0082 ms 87.5% 2025-09-07T10:51:46.9817709Z conv1x1_via_mm 0.0113 ms 63.6% 2025-09-07T10:51:46.9818282Z SingleProcess AUTOTUNE benchmarking takes 0.1240 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T10:51:47.6853414Z Autotune Choices Stats: 2025-09-07T10:51:47.6855098Z {"num_choices": 5, "num_triton_choices": 3, "best_kernel": "convolution", "best_time": 0.009216000325977802, "best_triton_pos": 1, "best_triton_time": 0.009216000325977802, "best_triton_kernel": "triton_convolution2d_57", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1"} 2025-09-07T10:51:47.6939105Z AUTOTUNE convolution(4x128x1x1, 64x128x1x1) 2025-09-07T10:51:47.6939488Z strides: [128, 1, 1, 1], [128, 1, 1, 1] 2025-09-07T10:51:47.6939822Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:47.6940430Z convolution 0.0092 ms 100.0% 2025-09-07T10:51:47.6941329Z triton_convolution2d_57 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:51:47.6942805Z triton_convolution2d_58 0.0092 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 
2025-09-07T10:51:47.6944359Z triton_convolution2d_56 0.0102 ms 90.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:47.6945318Z conv1x1_via_mm 0.0113 ms 81.8% 2025-09-07T10:51:47.6945905Z SingleProcess AUTOTUNE benchmarking takes 0.0985 seconds and 0.0002 seconds precompiling for 5 choices 2025-09-07T10:51:48.1422696Z Autotune Choices Stats: 2025-09-07T10:51:48.1424090Z {"num_choices": 5, "num_triton_choices": 3, "best_kernel": "triton_convolution2d_29", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=2", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T10:51:48.1511901Z AUTOTUNE convolution(4x64x1x1, 32x64x1x1) 2025-09-07T10:51:48.1512309Z strides: [64, 1, 1, 1], [64, 1, 1, 1] 2025-09-07T10:51:48.1512961Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:48.1513876Z triton_convolution2d_29 0.0072 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=2 2025-09-07T10:51:48.1514789Z convolution 0.0082 ms 87.5% 2025-09-07T10:51:48.1515667Z triton_convolution2d_27 0.0082 ms 87.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=2 2025-09-07T10:51:48.1517133Z triton_convolution2d_28 0.0082 ms 87.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=1 2025-09-07T10:51:48.1518013Z conv1x1_via_mm 0.0113 ms 63.6% 2025-09-07T10:51:48.1518582Z SingleProcess AUTOTUNE benchmarking takes 
0.1019 seconds and 0.0002 seconds precompiling for 5 choices 2025-09-07T10:51:49.2461146Z Autotune Choices Stats: 2025-09-07T10:51:49.2462513Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_134", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.02252800017595291, "best_triton_pos": 0} 2025-09-07T10:51:49.2550494Z AUTOTUNE convolution(4x512x7x7, 2048x512x1x1) 2025-09-07T10:51:49.2550908Z strides: [25088, 49, 7, 1], [512, 1, 1, 1] 2025-09-07T10:51:49.2551252Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:49.2552293Z triton_convolution2d_134 0.0225 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:49.2553216Z convolution 0.0236 ms 95.7% 2025-09-07T10:51:49.2554100Z triton_convolution2d_135 0.0307 ms 73.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:49.2555620Z triton_convolution2d_133 0.0317 ms 71.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:49.2557632Z triton_convolution2d_136 0.0369 ms 61.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:49.2559130Z triton_convolution2d_130 0.0389 ms 57.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 
2025-09-07T10:51:49.2560615Z triton_convolution2d_132 0.0410 ms 55.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:51:49.2562115Z triton_convolution2d_131 0.0461 ms 48.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:49.2563302Z conv1x1_via_mm 0.0522 ms 43.1% 2025-09-07T10:51:49.2563882Z SingleProcess AUTOTUNE benchmarking takes 0.1704 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:51:49.4355522Z Autotune Choices Stats: 2025-09-07T10:51:49.4357201Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.027648000046610832, "best_triton_pos": 1, "best_triton_time": 0.03788800165057182, "best_triton_kernel": "triton_convolution2d_141", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:51:49.4446399Z AUTOTUNE convolution(4x1024x7x7, 2048x1024x1x1) 2025-09-07T10:51:49.4446799Z strides: [50176, 49, 7, 1], [1024, 1, 1, 1] 2025-09-07T10:51:49.4447131Z dtypes: torch.float16, torch.float16 2025-09-07T10:51:49.4447477Z convolution 0.0276 ms 100.0% 2025-09-07T10:51:49.4448376Z triton_convolution2d_141 0.0379 ms 73.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:49.4449871Z triton_convolution2d_142 0.0512 ms 54.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:49.4451364Z 
triton_convolution2d_140 0.0532 ms 51.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:49.4452309Z conv1x1_via_mm 0.0655 ms 42.2% 2025-09-07T10:51:49.4453186Z triton_convolution2d_143 0.0655 ms 42.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:51:49.4454688Z triton_convolution2d_137 0.0707 ms 39.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:49.4456187Z triton_convolution2d_139 0.0727 ms 38.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:51:49.4457746Z triton_convolution2d_138 0.0840 ms 32.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:51:49.4458946Z SingleProcess AUTOTUNE benchmarking takes 0.1890 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:51:49.7489138Z Autotune Choices Stats: 2025-09-07T10:51:49.7490718Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_148", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T10:51:49.7580454Z AUTOTUNE addmm(4x1000, 4x2048, 2048x1000) 2025-09-07T10:51:49.7580801Z strides: [0, 1], [2048, 1], [1, 2048] 2025-09-07T10:51:49.7581164Z dtypes: torch.float16, torch.float16, 
torch.float16 2025-09-07T10:51:49.7581986Z triton_mm_148 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:51:49.7583216Z triton_mm_152 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:51:49.7583980Z bias_addmm 0.0184 ms 72.2% 2025-09-07T10:51:49.7584729Z triton_mm_156 0.0184 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:51:49.7585907Z triton_mm_160 0.0195 ms 68.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:51:49.7586661Z addmm 0.0205 ms 65.0% 2025-09-07T10:51:49.7587361Z triton_mm_147 0.0205 ms 65.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:51:49.7588736Z triton_mm_146 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:51:49.7589915Z triton_mm_151 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:51:49.7591099Z triton_mm_145 0.0225 ms 59.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:51:49.7592120Z SingleProcess AUTOTUNE benchmarking takes 0.3123 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:52:02.7729062Z Autotune Choices Stats: 
2025-09-07T10:52:02.7730294Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_178", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:52:02.7826159Z AUTOTUNE mm(1000x4, 4x2048) 2025-09-07T10:52:02.7826465Z strides: [1, 1000], [2048, 1] 2025-09-07T10:52:02.7826829Z dtypes: torch.float16, torch.float16 2025-09-07T10:52:02.7827613Z triton_mm_178 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:52:02.7828821Z triton_mm_180 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:02.7830003Z triton_mm_182 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:02.7831194Z triton_mm_183 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:52:02.7832390Z triton_mm_184 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:52:02.7833991Z triton_mm_185 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:02.7835212Z triton_mm_186 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T10:52:02.7836416Z triton_mm_187 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:02.7837640Z triton_mm_188 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:52:02.7838844Z triton_mm_189 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:02.7839870Z SingleProcess AUTOTUNE benchmarking takes 0.2204 seconds and 0.0006 seconds precompiling for 17 choices 2025-09-07T10:52:03.5296945Z Autotune Choices Stats: 2025-09-07T10:52:03.5298245Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_165", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T10:52:03.5390980Z AUTOTUNE mm(4x1000, 1000x2048) 2025-09-07T10:52:03.5391283Z strides: [1000, 1], [2048, 1] 2025-09-07T10:52:03.5391948Z dtypes: torch.float16, torch.float16 2025-09-07T10:52:03.5392726Z triton_mm_165 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:52:03.5393961Z triton_mm_169 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:03.5395172Z triton_mm_173 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 
2025-09-07T10:52:03.5396381Z triton_mm_177 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:03.5397146Z mm 0.0143 ms 92.9% 2025-09-07T10:52:03.5397822Z triton_mm_163 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:03.5399010Z triton_mm_164 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:52:03.5400199Z triton_mm_168 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:03.5401384Z triton_mm_172 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:03.5402577Z triton_mm_171 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:03.5403832Z SingleProcess AUTOTUNE benchmarking takes 0.2567 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:52:15.4175916Z pass 2025-09-07T10:52:19.1981267Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:52:19.1986721Z import pynvml # type: ignore[import] 2025-09-07T10:52:22.1109243Z 2025-09-07T10:52:24.7952082Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:52:24.7952453Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:52:24.7952793Z cuda train timm_vision_transformer 2025-09-07T10:52:42.4209814Z Autotune Choices Stats: 2025-09-07T10:52:42.4211037Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_52", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T10:52:42.4309093Z AUTOTUNE addmm(788x1536, 788x384, 384x1536) 2025-09-07T10:52:42.4309436Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T10:52:42.4309794Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:52:42.4310628Z triton_mm_52 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:42.4311829Z triton_mm_54 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:42.4313025Z triton_mm_55 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:42.4314587Z triton_mm_59 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:42.4315334Z bias_addmm 0.0174 ms 94.1% 2025-09-07T10:52:42.4316048Z triton_mm_50 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 
2025-09-07T10:52:42.4317227Z triton_mm_53 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:42.4318405Z triton_mm_56 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:42.4319586Z triton_mm_58 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:42.4320778Z triton_mm_60 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:42.4321789Z SingleProcess AUTOTUNE benchmarking takes 0.3075 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T10:52:44.0812207Z Autotune Choices Stats: 2025-09-07T10:52:44.0813956Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.10444799810647964, "best_triton_pos": 1, "best_triton_time": 0.1085439994931221, "best_triton_kernel": "triton_convolution2d_4", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:52:44.0907845Z AUTOTUNE convolution(4x3x224x224, 384x3x16x16) 2025-09-07T10:52:44.0908263Z strides: [150528, 50176, 224, 1], [768, 256, 16, 1] 2025-09-07T10:52:44.0908629Z dtypes: torch.float16, torch.float16 2025-09-07T10:52:44.0908938Z convolution 0.1044 ms 100.0% 2025-09-07T10:52:44.0909843Z triton_convolution2d_4 0.1085 ms 96.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, 
num_warps=4 2025-09-07T10:52:44.0911701Z triton_convolution2d_6 0.1864 ms 56.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:52:44.0913227Z triton_convolution2d_3 0.2038 ms 51.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:52:44.0914738Z triton_convolution2d_1 0.2099 ms 49.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:52:44.0916251Z triton_convolution2d_0 0.2734 ms 38.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:52:44.0917756Z triton_convolution2d_5 0.2744 ms 38.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:52:44.0919266Z triton_convolution2d_2 0.4239 ms 24.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:52:44.0920439Z SingleProcess AUTOTUNE benchmarking takes 0.2651 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:52:44.3817539Z Autotune Choices Stats: 2025-09-07T10:52:44.3819091Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "bias_addmm", "best_time": 0.016383999958634377, "best_triton_pos": 1, "best_triton_time": 0.016383999958634377, "best_triton_kernel": "triton_mm_14", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T10:52:44.3911948Z AUTOTUNE addmm(788x1152, 788x384, 384x1152) 2025-09-07T10:52:44.3912285Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T10:52:44.3912638Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:52:44.3913013Z bias_addmm 0.0164 ms 100.0% 2025-09-07T10:52:44.3913734Z triton_mm_14 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:52:44.3914929Z triton_mm_16 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:44.3916116Z triton_mm_18 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:44.3917309Z triton_mm_19 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:44.3918495Z triton_mm_22 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:44.3919698Z triton_mm_23 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:44.3920892Z triton_mm_24 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:44.3922076Z triton_mm_17 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:44.3923821Z triton_mm_20 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:44.3924848Z SingleProcess AUTOTUNE benchmarking takes 0.2992 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:52:44.6384751Z Autotune Choices Stats: 2025-09-07T10:52:44.6386257Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.011264000087976456, "best_triton_pos": 1, "best_triton_time": 0.011264000087976456, "best_triton_kernel": "triton_mm_32", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T10:52:44.6479767Z AUTOTUNE mm(788x384, 384x384) 2025-09-07T10:52:44.6480083Z strides: [384, 1], [1, 384] 2025-09-07T10:52:44.6480371Z dtypes: torch.float16, torch.float16 2025-09-07T10:52:44.6480684Z mm 0.0113 ms 100.0% 2025-09-07T10:52:44.6481401Z triton_mm_32 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:52:44.6482602Z triton_mm_33 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:44.6484046Z triton_mm_36 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:44.6485528Z triton_mm_26 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:52:44.6486707Z triton_mm_27 
0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:44.6487890Z triton_mm_35 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:44.6489071Z triton_mm_38 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:44.6490243Z triton_mm_28 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:44.6491417Z triton_mm_29 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:44.6492439Z SingleProcess AUTOTUNE benchmarking takes 0.2557 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:52:44.9368680Z Autotune Choices Stats: 2025-09-07T10:52:44.9371596Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.016383999958634377, "best_triton_pos": 1, "best_triton_time": 0.016383999958634377, "best_triton_kernel": "triton_mm_69", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:52:44.9464967Z AUTOTUNE mm(788x1536, 1536x384) 2025-09-07T10:52:44.9465548Z strides: [1536, 1], [1, 1536] 2025-09-07T10:52:44.9466088Z dtypes: torch.float16, torch.float16 2025-09-07T10:52:44.9466617Z mm 0.0164 ms 100.0% 2025-09-07T10:52:44.9467854Z triton_mm_69 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:44.9470378Z triton_mm_68 0.0205 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:52:44.9471844Z triton_mm_65 0.0225 ms 72.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:44.9473030Z triton_mm_62 0.0236 ms 69.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:52:44.9474215Z triton_mm_72 0.0236 ms 69.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:44.9475530Z triton_mm_64 0.0266 ms 61.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:44.9476717Z triton_mm_71 0.0266 ms 61.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:44.9478627Z triton_mm_74 0.0266 ms 61.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:44.9480741Z triton_mm_63 0.0276 ms 59.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:44.9482677Z SingleProcess AUTOTUNE benchmarking takes 0.2972 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:52:45.3117572Z Autotune Choices Stats: 2025-09-07T10:52:45.3118796Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": 
"triton_mm_875", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:52:45.3213745Z AUTOTUNE addmm(4x1000, 4x384, 384x1000) 2025-09-07T10:52:45.3214109Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T10:52:45.3214464Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:52:45.3215284Z triton_mm_875 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:52:45.3216473Z triton_mm_879 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:45.3217727Z triton_mm_872 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:52:45.3218920Z triton_mm_873 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:45.3220105Z triton_mm_874 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:52:45.3221279Z triton_mm_878 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:45.3222456Z triton_mm_882 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:45.3223797Z triton_mm_883 0.0102 ms 90.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:52:45.3225340Z triton_mm_885 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:52:45.3226536Z triton_mm_887 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:45.3227570Z SingleProcess AUTOTUNE benchmarking takes 0.2737 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:52:57.9452528Z Autotune Choices Stats: 2025-09-07T10:52:57.9453733Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_932", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.014336000196635723, "best_triton_pos": 0} 2025-09-07T10:52:57.9551957Z AUTOTUNE mm(788x384, 384x1536) 2025-09-07T10:52:57.9552280Z strides: [384, 1], [1536, 1] 2025-09-07T10:52:57.9552585Z dtypes: torch.float16, torch.float16 2025-09-07T10:52:57.9553370Z triton_mm_932 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:57.9554561Z triton_mm_930 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:57.9555760Z triton_mm_933 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:57.9557298Z triton_mm_937 0.0154 ms 93.3% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:57.9558507Z triton_mm_938 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:57.9559266Z mm 0.0164 ms 87.5% 2025-09-07T10:52:57.9559947Z triton_mm_928 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:52:57.9561139Z triton_mm_931 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:57.9562311Z triton_mm_936 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:57.9563731Z triton_mm_934 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:57.9564767Z SingleProcess AUTOTUNE benchmarking takes 0.2653 seconds and 0.0004 seconds precompiling for 19 choices 2025-09-07T10:52:58.8618724Z Autotune Choices Stats: 2025-09-07T10:52:58.8620240Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.015359999611973763, "best_triton_pos": 1, "best_triton_time": 0.01740800030529499, "best_triton_kernel": "triton_mm_949", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8"} 2025-09-07T10:52:58.8718882Z AUTOTUNE mm(384x788, 788x1536) 2025-09-07T10:52:58.8719250Z strides: [1, 384], [1536, 1] 2025-09-07T10:52:58.8719562Z dtypes: torch.float16, 
torch.float16 2025-09-07T10:52:58.8719877Z mm 0.0154 ms 100.0% 2025-09-07T10:52:58.8720612Z triton_mm_949 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:58.8722214Z triton_mm_952 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:58.8723645Z triton_mm_950 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:58.8724841Z triton_mm_946 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:52:58.8726018Z triton_mm_947 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:58.8727225Z triton_mm_951 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:58.8728435Z triton_mm_956 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:58.8729630Z triton_mm_941 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:58.8730816Z triton_mm_948 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:58.8732031Z SingleProcess 
AUTOTUNE benchmarking takes 0.2783 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:52:59.3640154Z Autotune Choices Stats: 2025-09-07T10:52:59.3641689Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.015359999611973763, "best_triton_pos": 1, "best_triton_time": 0.01740800030529499, "best_triton_kernel": "triton_mm_985", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8"} 2025-09-07T10:52:59.3736471Z AUTOTUNE mm(1536x788, 788x384) 2025-09-07T10:52:59.3736851Z strides: [1, 1536], [384, 1] 2025-09-07T10:52:59.3737155Z dtypes: torch.float16, torch.float16 2025-09-07T10:52:59.3737501Z mm 0.0154 ms 100.0% 2025-09-07T10:52:59.3738214Z triton_mm_985 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:59.3739445Z triton_mm_988 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:59.3740641Z triton_mm_986 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:59.3741846Z triton_mm_982 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:52:59.3743036Z triton_mm_983 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:59.3744230Z triton_mm_987 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:59.3745421Z triton_mm_992 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:59.3746618Z triton_mm_977 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:59.3748112Z triton_mm_978 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:59.3749157Z SingleProcess AUTOTUNE benchmarking takes 0.2768 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:52:59.9851129Z Autotune Choices Stats: 2025-09-07T10:52:59.9852677Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.014336000196635723, "best_triton_pos": 1, "best_triton_time": 0.014336000196635723, "best_triton_kernel": "triton_mm_1055", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:52:59.9948410Z AUTOTUNE mm(1152x788, 788x384) 2025-09-07T10:52:59.9948730Z strides: [1, 1152], [384, 1] 2025-09-07T10:52:59.9949026Z dtypes: torch.float16, torch.float16 2025-09-07T10:52:59.9949337Z mm 0.0143 ms 100.0% 2025-09-07T10:52:59.9950057Z triton_mm_1055 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:59.9951268Z triton_mm_1054 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:52:59.9952474Z triton_mm_1060 0.0164 ms 
87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:59.9954014Z triton_mm_1049 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:59.9955228Z triton_mm_1050 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:52:59.9956434Z triton_mm_1057 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:52:59.9957629Z triton_mm_1058 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:52:59.9958843Z triton_mm_1051 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:52:59.9960054Z triton_mm_1048 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:52:59.9961097Z SingleProcess AUTOTUNE benchmarking takes 0.2739 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:53:00.5748726Z Autotune Choices Stats: 2025-09-07T10:53:00.5750842Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_917", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T10:53:00.5848953Z AUTOTUNE mm(1000x4, 
4x384) 2025-09-07T10:53:00.5849507Z strides: [1, 1000], [384, 1] 2025-09-07T10:53:00.5850109Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:00.5851470Z triton_mm_917 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:53:00.5853629Z triton_mm_905 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:53:00.5856242Z triton_mm_906 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:53:00.5858347Z triton_mm_907 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:53:00.5860439Z triton_mm_908 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:53:00.5862525Z triton_mm_909 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:53:00.5864614Z triton_mm_910 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:53:00.5866679Z triton_mm_911 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:53:00.5868729Z triton_mm_912 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=4 2025-09-07T10:53:00.5870814Z triton_mm_913 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:53:00.5872939Z SingleProcess AUTOTUNE benchmarking takes 0.2220 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:53:01.0639104Z Autotune Choices Stats: 2025-09-07T10:53:01.0641921Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.01228800043463707, "best_triton_pos": 1, "best_triton_time": 0.01228800043463707, "best_triton_kernel": "triton_mm_1015", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:53:01.0737517Z AUTOTUNE mm(384x788, 788x384) 2025-09-07T10:53:01.0738093Z strides: [1, 384], [384, 1] 2025-09-07T10:53:01.0738610Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:01.0739165Z mm 0.0123 ms 100.0% 2025-09-07T10:53:01.0740514Z triton_mm_1015 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:53:01.0742832Z triton_mm_1019 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:53:01.0745124Z triton_mm_1013 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:53:01.0747364Z triton_mm_1014 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:53:01.0749614Z triton_mm_1018 0.0143 ms 85.7% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:53:01.0751889Z triton_mm_1012 0.0154 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:53:01.0754168Z triton_mm_1021 0.0164 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:53:01.0756878Z triton_mm_1022 0.0164 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:53:01.0759186Z triton_mm_1024 0.0164 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:53:01.0761119Z SingleProcess AUTOTUNE benchmarking takes 0.2677 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:53:02.3429029Z Autotune Choices Stats: 2025-09-07T10:53:02.3430295Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_892", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T10:53:02.3525696Z AUTOTUNE mm(4x1000, 1000x384) 2025-09-07T10:53:02.3526019Z strides: [1000, 1], [384, 1] 2025-09-07T10:53:02.3526357Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:02.3527144Z triton_mm_892 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:53:02.3528374Z triton_mm_896 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:53:02.3529121Z mm 0.0123 ms 83.3% 2025-09-07T10:53:02.3529827Z triton_mm_900 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:53:02.3534751Z triton_mm_890 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:53:02.3535950Z triton_mm_891 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:53:02.3537141Z triton_mm_904 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:53:02.3538443Z triton_mm_895 0.0143 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:53:02.3539623Z triton_mm_889 0.0154 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:53:02.3540822Z triton_mm_899 0.0154 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:53:02.3541860Z SingleProcess AUTOTUNE benchmarking takes 0.2544 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:53:02.6326591Z Autotune Choices Stats: 2025-09-07T10:53:02.6328045Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.015359999611973763, "best_triton_pos": 1, "best_triton_time": 
0.01740800030529499, "best_triton_kernel": "triton_mm_965", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:53:02.6422398Z AUTOTUNE mm(788x1536, 1536x384) 2025-09-07T10:53:02.6422715Z strides: [1536, 1], [384, 1] 2025-09-07T10:53:02.6423006Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:02.6423326Z mm 0.0154 ms 100.0% 2025-09-07T10:53:02.6424033Z triton_mm_965 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:53:02.6425517Z triton_mm_964 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:53:02.6426721Z triton_mm_961 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:53:02.6427911Z triton_mm_968 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:53:02.6429111Z triton_mm_958 0.0246 ms 62.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:53:02.6430287Z triton_mm_970 0.0246 ms 62.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:53:02.6431479Z triton_mm_967 0.0256 ms 60.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:53:02.6432659Z triton_mm_959 0.0266 ms 57.7% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:53:02.6433827Z triton_mm_960 0.0266 ms 57.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:53:02.6435013Z SingleProcess AUTOTUNE benchmarking takes 0.2890 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:53:02.8849052Z Autotune Choices Stats: 2025-09-07T10:53:02.8850496Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.011264000087976456, "best_triton_pos": 1, "best_triton_time": 0.011264000087976456, "best_triton_kernel": "triton_mm_1000", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T10:53:02.8945060Z AUTOTUNE mm(788x384, 384x384) 2025-09-07T10:53:02.8945378Z strides: [384, 1], [384, 1] 2025-09-07T10:53:02.8945675Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:02.8946001Z mm 0.0113 ms 100.0% 2025-09-07T10:53:02.8946715Z triton_mm_1000 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:53:02.8947917Z triton_mm_1001 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:53:02.8949141Z triton_mm_1004 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:53:02.8950337Z triton_mm_994 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:53:02.8951530Z triton_mm_1003 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:53:02.8952727Z triton_mm_1006 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:53:02.8953921Z triton_mm_995 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:53:02.8955294Z triton_mm_996 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:53:02.8956473Z triton_mm_997 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:53:02.8957507Z SingleProcess AUTOTUNE benchmarking takes 0.2511 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:53:03.1621203Z Autotune Choices Stats: 2025-09-07T10:53:03.1622663Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.014336000196635723, "best_triton_pos": 1, "best_triton_time": 0.015359999611973763, "best_triton_kernel": "triton_mm_1037", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:53:03.1717683Z AUTOTUNE mm(788x1152, 1152x384) 2025-09-07T10:53:03.1718008Z strides: [1152, 1], [384, 1] 2025-09-07T10:53:03.1718332Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:03.1718643Z mm 0.0143 ms 100.0% 2025-09-07T10:53:03.1719349Z triton_mm_1037 0.0154 ms 93.3% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:53:03.1720548Z triton_mm_1036 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:53:03.1721965Z triton_mm_1040 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:53:03.1723378Z triton_mm_1033 0.0195 ms 73.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:53:03.1724622Z triton_mm_1030 0.0215 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:53:03.1725821Z triton_mm_1031 0.0215 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:53:03.1727006Z triton_mm_1039 0.0215 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:53:03.1728215Z triton_mm_1042 0.0215 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:53:03.1729407Z triton_mm_1032 0.0225 ms 63.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:53:03.1730436Z SingleProcess AUTOTUNE benchmarking takes 0.2760 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:53:08.4958213Z pass 
2025-09-07T10:53:12.7269267Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:53:12.7270684Z import pynvml # type: ignore[import] 2025-09-07T10:53:15.4020432Z 2025-09-07T10:53:29.5416338Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:53:29.5416738Z loading model: 0it [00:14, ?it/s] 2025-09-07T10:53:29.5417080Z cuda train timm_vision_transformer_large 2025-09-07T10:53:29.5651113Z pass_due_to_skip 2025-09-07T10:53:31.5840296Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:53:31.5843294Z import pynvml # type: ignore[import] 2025-09-07T10:53:34.2875414Z 2025-09-07T10:53:37.1227845Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:53:37.1228223Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:53:37.1228539Z cuda train timm_vovnet 2025-09-07T10:53:54.0892262Z Autotune Choices Stats: 2025-09-07T10:53:54.0894079Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_1", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.01740800030529499, "best_triton_pos": 0} 2025-09-07T10:53:54.1001306Z AUTOTUNE convolution(4x3x224x224, 64x3x3x3) 2025-09-07T10:53:54.1002041Z strides: [150528, 50176, 224, 1], [27, 9, 3, 1] 2025-09-07T10:53:54.1002485Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:54.1129719Z triton_convolution2d_1 0.0174 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:54.1131261Z triton_convolution2d_5 0.0174 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:54.1132765Z triton_convolution2d_0 0.0195 ms 89.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:54.1134598Z triton_convolution2d_3 0.0205 ms 85.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:54.1136083Z triton_convolution2d_4 0.0225 ms 77.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:54.1136977Z convolution 0.0307 ms 56.7% 2025-09-07T10:53:54.1137958Z triton_convolution2d_2 0.0307 ms 56.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:53:54.1139144Z SingleProcess AUTOTUNE benchmarking takes 0.1363 seconds and 0.0009 seconds precompiling for 7 choices 2025-09-07T10:53:54.3518249Z Autotune Choices Stats: 2025-09-07T10:53:54.3520586Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_7", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.05734400078654289, "best_triton_pos": 0} 
2025-09-07T10:53:54.3625757Z AUTOTUNE convolution(4x64x112x112, 64x64x3x3) 2025-09-07T10:53:54.3626187Z strides: [802816, 12544, 112, 1], [576, 9, 3, 1] 2025-09-07T10:53:54.3626552Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:54.3627459Z triton_convolution2d_7 0.0573 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:54.3628400Z convolution 0.0614 ms 93.3% 2025-09-07T10:53:54.3629278Z triton_convolution2d_12 0.0614 ms 93.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:54.3631126Z triton_convolution2d_9 0.0696 ms 82.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:54.3632617Z triton_convolution2d_11 0.0778 ms 73.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:54.3634100Z triton_convolution2d_10 0.0809 ms 70.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:54.3635581Z triton_convolution2d_6 0.0819 ms 70.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:54.3637045Z triton_convolution2d_8 0.1792 ms 32.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, 
num_warps=8 2025-09-07T10:53:54.3638218Z SingleProcess AUTOTUNE benchmarking takes 0.1947 seconds and 0.0003 seconds precompiling for 8 choices 2025-09-07T10:53:54.5869276Z Autotune Choices Stats: 2025-09-07T10:53:54.5870970Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04095999896526337, "best_triton_pos": 1, "best_triton_time": 0.04915200173854828, "best_triton_kernel": "triton_convolution2d_58", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T10:53:54.5972228Z AUTOTUNE convolution(4x768x56x56, 256x768x1x1) 2025-09-07T10:53:54.5972665Z strides: [2408448, 3136, 56, 1], [768, 1, 1, 1] 2025-09-07T10:53:54.5973019Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:54.5973339Z convolution 0.0410 ms 100.0% 2025-09-07T10:53:54.5974240Z triton_convolution2d_58 0.0492 ms 83.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:54.5975745Z triton_convolution2d_60 0.0502 ms 81.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:54.5977328Z triton_convolution2d_61 0.0512 ms 80.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:54.5978853Z triton_convolution2d_56 0.0604 ms 67.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:54.5980326Z triton_convolution2d_59 0.0635 ms 64.5% 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:54.5981794Z triton_convolution2d_55 0.0676 ms 60.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:54.5983276Z triton_convolution2d_57 0.0952 ms 43.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:53:54.5984177Z conv1x1_via_mm 0.2079 ms 19.7% 2025-09-07T10:53:54.5984749Z SingleProcess AUTOTUNE benchmarking takes 0.1973 seconds and 0.0003 seconds precompiling for 9 choices 2025-09-07T10:53:55.1068102Z Autotune Choices Stats: 2025-09-07T10:53:55.1069882Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_19", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8", "best_time": 0.043007999658584595, "best_triton_pos": 0} 2025-09-07T10:53:55.1169915Z AUTOTUNE convolution(4x64x112x112, 128x64x3x3) 2025-09-07T10:53:55.1170295Z strides: [802816, 12544, 112, 1], [576, 9, 3, 1] 2025-09-07T10:53:55.1178409Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:55.1179348Z triton_convolution2d_19 0.0430 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:55.1180305Z convolution 0.0461 ms 93.3% 2025-09-07T10:53:55.1181201Z triton_convolution2d_16 0.0522 ms 82.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, 
PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:55.1182696Z triton_convolution2d_14 0.0543 ms 79.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:55.1184173Z triton_convolution2d_13 0.0584 ms 73.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:55.1185914Z triton_convolution2d_18 0.0594 ms 72.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:55.1187400Z triton_convolution2d_17 0.0758 ms 56.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:55.1188897Z triton_convolution2d_15 0.1741 ms 24.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:53:55.1190079Z SingleProcess AUTOTUNE benchmarking takes 0.1750 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:53:55.3592771Z Autotune Choices Stats: 2025-09-07T10:53:55.3594476Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04608000069856644, "best_triton_pos": 1, "best_triton_time": 0.07680000364780426, "best_triton_kernel": "triton_convolution2d_26", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:53:55.3691164Z AUTOTUNE 
convolution(4x128x56x56, 128x128x3x3) 2025-09-07T10:53:55.3691576Z strides: [401408, 3136, 56, 1], [1152, 9, 3, 1] 2025-09-07T10:53:55.3691920Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:55.3692242Z convolution 0.0461 ms 100.0% 2025-09-07T10:53:55.3693133Z triton_convolution2d_26 0.0768 ms 60.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:55.3694626Z triton_convolution2d_21 0.0973 ms 47.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:55.3696132Z triton_convolution2d_20 0.1055 ms 43.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:55.3698073Z triton_convolution2d_23 0.1055 ms 43.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:55.3699560Z triton_convolution2d_24 0.1126 ms 40.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:55.3701047Z triton_convolution2d_25 0.1198 ms 38.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:55.3702542Z triton_convolution2d_22 0.1823 ms 25.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 
2025-09-07T10:53:55.3703714Z SingleProcess AUTOTUNE benchmarking takes 0.2019 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:53:55.6200914Z Autotune Choices Stats: 2025-09-07T10:53:55.6202591Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03788800165057182, "best_triton_pos": 1, "best_triton_time": 0.04198399931192398, "best_triton_kernel": "triton_convolution2d_101", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:53:55.6300359Z AUTOTUNE convolution(4x1056x28x28, 512x1056x1x1) 2025-09-07T10:53:55.6301122Z strides: [827904, 784, 28, 1], [1056, 1, 1, 1] 2025-09-07T10:53:55.6301476Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:55.6301786Z convolution 0.0379 ms 100.0% 2025-09-07T10:53:55.6302674Z triton_convolution2d_101 0.0420 ms 90.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:55.6304187Z triton_convolution2d_100 0.0543 ms 69.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:55.6305684Z triton_convolution2d_103 0.0553 ms 68.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:55.6307176Z triton_convolution2d_102 0.0563 ms 67.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:55.6308664Z triton_convolution2d_98 0.0686 ms 55.2% ALLOW_TF32=False, 
BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:55.6310143Z triton_convolution2d_97 0.0737 ms 51.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:55.6311623Z triton_convolution2d_99 0.1536 ms 24.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:53:55.6312535Z conv1x1_via_mm 0.1679 ms 22.6% 2025-09-07T10:53:55.6313112Z SingleProcess AUTOTUNE benchmarking takes 0.2028 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:53:56.1545822Z Autotune Choices Stats: 2025-09-07T10:53:56.1547937Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03481600061058998, "best_triton_pos": 1, "best_triton_time": 0.050175998359918594, "best_triton_kernel": "triton_convolution2d_143", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:53:56.1649980Z AUTOTUNE convolution(4x1472x14x14, 768x1472x1x1) 2025-09-07T10:53:56.1650385Z strides: [288512, 196, 14, 1], [1472, 1, 1, 1] 2025-09-07T10:53:56.1650740Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:56.1651066Z convolution 0.0348 ms 100.0% 2025-09-07T10:53:56.1651977Z triton_convolution2d_143 0.0502 ms 69.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:56.1653499Z triton_convolution2d_144 0.0645 ms 54.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:56.1655007Z triton_convolution2d_142 0.0717 ms 48.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:56.1656487Z triton_convolution2d_145 0.0748 ms 46.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:56.1658072Z triton_convolution2d_139 0.0952 ms 36.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:56.1659911Z triton_convolution2d_140 0.1065 ms 32.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:56.1661418Z triton_convolution2d_141 0.1321 ms 26.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:53:56.1662331Z conv1x1_via_mm 0.1382 ms 25.2% 2025-09-07T10:53:56.1662901Z SingleProcess AUTOTUNE benchmarking takes 0.2211 seconds and 0.0003 seconds precompiling for 9 choices 2025-09-07T10:53:56.6695843Z Autotune Choices Stats: 2025-09-07T10:53:56.6697607Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03891199827194214, "best_triton_pos": 1, "best_triton_time": 0.05734400078654289, "best_triton_kernel": "triton_convolution2d_185", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, 
PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:53:56.6796493Z AUTOTUNE convolution(4x1728x14x14, 768x1728x1x1) 2025-09-07T10:53:56.6796888Z strides: [338688, 196, 14, 1], [1728, 1, 1, 1] 2025-09-07T10:53:56.6797251Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:56.6797559Z convolution 0.0389 ms 100.0% 2025-09-07T10:53:56.6798450Z triton_convolution2d_185 0.0573 ms 67.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:56.6799932Z triton_convolution2d_186 0.0748 ms 52.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:56.6801437Z triton_convolution2d_184 0.0829 ms 46.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:56.6803500Z triton_convolution2d_187 0.0860 ms 45.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:56.6805003Z triton_convolution2d_181 0.1106 ms 35.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:56.6806488Z triton_convolution2d_182 0.1239 ms 31.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:56.6807399Z conv1x1_via_mm 0.1362 ms 28.6% 2025-09-07T10:53:56.6808288Z triton_convolution2d_183 0.1546 ms 25.2% 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:53:56.6809471Z SingleProcess AUTOTUNE benchmarking takes 0.2263 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:53:56.9267944Z Autotune Choices Stats: 2025-09-07T10:53:56.9269562Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04915200173854828, "best_triton_pos": 1, "best_triton_time": 0.12390399724245071, "best_triton_kernel": "triton_convolution2d_66", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:53:56.9368998Z AUTOTUNE convolution(4x256x28x28, 160x256x3x3) 2025-09-07T10:53:56.9369817Z strides: [200704, 784, 28, 1], [2304, 9, 3, 1] 2025-09-07T10:53:56.9370155Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:56.9370474Z convolution 0.0492 ms 100.0% 2025-09-07T10:53:56.9371364Z triton_convolution2d_66 0.1239 ms 39.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:56.9372871Z triton_convolution2d_68 0.1341 ms 36.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:56.9374372Z triton_convolution2d_65 0.1536 ms 32.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:56.9375861Z triton_convolution2d_63 0.1669 ms 29.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, 
PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:56.9377435Z triton_convolution2d_67 0.1679 ms 29.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:56.9378917Z triton_convolution2d_62 0.1915 ms 25.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:56.9380418Z triton_convolution2d_64 0.2632 ms 18.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:53:56.9381591Z SingleProcess AUTOTUNE benchmarking takes 0.2435 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:53:57.4198277Z Autotune Choices Stats: 2025-09-07T10:53:57.4201924Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04198399931192398, "best_triton_pos": 1, "best_triton_time": 0.07782399654388428, "best_triton_kernel": "triton_convolution2d_73", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:53:57.4300357Z AUTOTUNE convolution(4x160x28x28, 160x160x3x3) 2025-09-07T10:53:57.4300776Z strides: [125440, 784, 28, 1], [1440, 9, 3, 1] 2025-09-07T10:53:57.4301125Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:57.4301433Z convolution 0.0420 ms 100.0% 2025-09-07T10:53:57.4302321Z triton_convolution2d_73 0.0778 ms 53.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 
2025-09-07T10:53:57.4303839Z triton_convolution2d_75 0.0850 ms 49.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:57.4305341Z triton_convolution2d_72 0.1014 ms 41.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:57.4306834Z triton_convolution2d_70 0.1096 ms 38.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:57.4308313Z triton_convolution2d_74 0.1157 ms 36.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:57.4310101Z triton_convolution2d_69 0.1249 ms 33.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:57.4311583Z triton_convolution2d_71 0.1679 ms 25.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:53:57.4312765Z SingleProcess AUTOTUNE benchmarking takes 0.2012 seconds and 0.0003 seconds precompiling for 8 choices 2025-09-07T10:53:57.6902595Z Autotune Choices Stats: 2025-09-07T10:53:57.6904259Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.027648000046610832, "best_triton_pos": 1, "best_triton_time": 0.060416001826524734, "best_triton_kernel": "triton_convolution2d_227", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:53:57.7002195Z AUTOTUNE convolution(4x1888x7x7, 1024x1888x1x1) 2025-09-07T10:53:57.7003189Z strides: [92512, 49, 7, 1], [1888, 1, 1, 1] 2025-09-07T10:53:57.7003815Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:57.7004343Z convolution 0.0276 ms 100.0% 2025-09-07T10:53:57.7005959Z triton_convolution2d_227 0.0604 ms 45.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:57.7008722Z triton_convolution2d_228 0.0840 ms 32.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:57.7011422Z triton_convolution2d_226 0.0891 ms 31.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:57.7014282Z triton_convolution2d_229 0.1126 ms 24.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:53:57.7017508Z triton_convolution2d_223 0.1178 ms 23.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:57.7020307Z triton_convolution2d_225 0.1229 ms 22.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:53:57.7022024Z conv1x1_via_mm 0.1300 ms 21.3% 2025-09-07T10:53:57.7023683Z 
triton_convolution2d_224 0.1423 ms 19.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:53:57.7025882Z SingleProcess AUTOTUNE benchmarking takes 0.2263 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:53:58.2716297Z Autotune Choices Stats: 2025-09-07T10:53:58.2719391Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04710400104522705, "best_triton_pos": 1, "best_triton_time": 0.1884160041809082, "best_triton_kernel": "triton_convolution2d_108", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:53:58.2814702Z AUTOTUNE convolution(4x512x14x14, 192x512x3x3) 2025-09-07T10:53:58.2815129Z strides: [100352, 196, 14, 1], [4608, 9, 3, 1] 2025-09-07T10:53:58.2815825Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:58.2816137Z convolution 0.0471 ms 100.0% 2025-09-07T10:53:58.2817037Z triton_convolution2d_108 0.1884 ms 25.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:58.2818630Z triton_convolution2d_110 0.2560 ms 18.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:58.2820144Z triton_convolution2d_107 0.3000 ms 15.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:58.2821639Z triton_convolution2d_105 0.3277 ms 14.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, 
BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:58.2823132Z triton_convolution2d_109 0.3482 ms 13.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:58.2824628Z triton_convolution2d_104 0.4137 ms 11.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:58.2826134Z triton_convolution2d_106 0.4639 ms 10.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:53:58.2827301Z SingleProcess AUTOTUNE benchmarking takes 0.2918 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:53:59.1770196Z Autotune Choices Stats: 2025-09-07T10:53:59.1771928Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03174399957060814, "best_triton_pos": 1, "best_triton_time": 0.07475200295448303, "best_triton_kernel": "triton_convolution2d_115", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:53:59.1875455Z AUTOTUNE convolution(4x192x14x14, 192x192x3x3) 2025-09-07T10:53:59.1875855Z strides: [37632, 196, 14, 1], [1728, 9, 3, 1] 2025-09-07T10:53:59.1876209Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:59.1876544Z convolution 0.0317 ms 100.0% 2025-09-07T10:53:59.1877416Z triton_convolution2d_115 0.0748 ms 42.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, 
STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:59.1878925Z triton_convolution2d_117 0.1014 ms 31.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:59.1880421Z triton_convolution2d_114 0.1157 ms 27.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:59.1881918Z triton_convolution2d_112 0.1290 ms 24.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:59.1883655Z triton_convolution2d_116 0.1362 ms 23.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:59.1885145Z triton_convolution2d_111 0.1628 ms 19.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:59.1886817Z triton_convolution2d_113 0.1833 ms 17.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:53:59.1887997Z SingleProcess AUTOTUNE benchmarking takes 0.6119 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:53:59.5171373Z Autotune Choices Stats: 2025-09-07T10:53:59.5173052Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.055296000093221664, "best_triton_pos": 1, "best_triton_time": 0.27136000990867615, "best_triton_kernel": "triton_convolution2d_150", 
"best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:53:59.5272385Z AUTOTUNE convolution(4x768x14x14, 192x768x3x3) 2025-09-07T10:53:59.5272802Z strides: [150528, 196, 14, 1], [6912, 9, 3, 1] 2025-09-07T10:53:59.5273165Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:59.5273477Z convolution 0.0553 ms 100.0% 2025-09-07T10:53:59.5274371Z triton_convolution2d_150 0.2714 ms 20.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:59.5275868Z triton_convolution2d_152 0.3799 ms 14.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:59.5277369Z triton_convolution2d_149 0.4444 ms 12.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:59.5278874Z triton_convolution2d_147 0.4905 ms 11.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:59.5280698Z triton_convolution2d_151 0.5253 ms 10.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:59.5282199Z triton_convolution2d_146 0.6287 ms 8.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 
2025-09-07T10:53:59.5283910Z triton_convolution2d_148 0.6994 ms 7.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:53:59.5285091Z SingleProcess AUTOTUNE benchmarking takes 0.2957 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:53:59.8705025Z Autotune Choices Stats: 2025-09-07T10:53:59.8706743Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.05222399905323982, "best_triton_pos": 1, "best_triton_time": 0.24985599517822266, "best_triton_kernel": "triton_convolution2d_192", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:53:59.8805982Z AUTOTUNE convolution(4x768x7x7, 224x768x3x3) 2025-09-07T10:53:59.8806367Z strides: [37632, 49, 7, 1], [6912, 9, 3, 1] 2025-09-07T10:53:59.8806697Z dtypes: torch.float16, torch.float16 2025-09-07T10:53:59.8807024Z convolution 0.0522 ms 100.0% 2025-09-07T10:53:59.8807910Z triton_convolution2d_192 0.2499 ms 20.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:59.8809717Z triton_convolution2d_190 0.3092 ms 16.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:53:59.8811226Z triton_convolution2d_194 0.3758 ms 13.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:59.8812722Z triton_convolution2d_191 0.4342 ms 12.0% ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:59.8814217Z triton_convolution2d_189 0.4506 ms 11.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:59.8815702Z triton_convolution2d_193 0.5857 ms 8.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:53:59.8817201Z triton_convolution2d_188 0.8090 ms 6.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:53:59.8818437Z SingleProcess AUTOTUNE benchmarking takes 0.2958 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:54:00.3787045Z Autotune Choices Stats: 2025-09-07T10:54:00.3790063Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.06656000018119812, "best_triton_pos": 1, "best_triton_time": 0.07577600330114365, "best_triton_kernel": "triton_convolution2d_199", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:54:00.3890240Z AUTOTUNE convolution(4x224x7x7, 224x224x3x3) 2025-09-07T10:54:00.3890914Z strides: [10976, 49, 7, 1], [2016, 9, 3, 1] 2025-09-07T10:54:00.3891926Z dtypes: torch.float16, torch.float16 2025-09-07T10:54:00.3892500Z convolution 0.0666 ms 100.0% 2025-09-07T10:54:00.3894126Z triton_convolution2d_199 0.0758 ms 87.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, 
PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:54:00.3896932Z triton_convolution2d_197 0.0942 ms 70.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:54:00.3899832Z triton_convolution2d_201 0.1178 ms 56.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:54:00.3902779Z triton_convolution2d_198 0.1331 ms 50.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:54:00.3905603Z triton_convolution2d_196 0.1475 ms 45.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:54:00.3908416Z triton_convolution2d_200 0.1751 ms 38.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:54:00.3911574Z triton_convolution2d_195 0.2499 ms 26.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:54:00.3913786Z SingleProcess AUTOTUNE benchmarking takes 0.2236 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:54:00.7240340Z Autotune Choices Stats: 2025-09-07T10:54:00.7243939Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.060416001826524734, "best_triton_pos": 1, "best_triton_time": 0.34406399726867676, "best_triton_kernel": 
"triton_convolution2d_234", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:54:00.7353076Z AUTOTUNE convolution(4x1024x7x7, 224x1024x3x3) 2025-09-07T10:54:00.7353831Z strides: [50176, 49, 7, 1], [9216, 9, 3, 1] 2025-09-07T10:54:00.7354442Z dtypes: torch.float16, torch.float16 2025-09-07T10:54:00.7354997Z convolution 0.0604 ms 100.0% 2025-09-07T10:54:00.7356621Z triton_convolution2d_234 0.3441 ms 17.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:54:00.7359478Z triton_convolution2d_232 0.4076 ms 14.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:54:00.7362246Z triton_convolution2d_236 0.5007 ms 12.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:54:00.7365271Z triton_convolution2d_233 0.5652 ms 10.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:54:00.7368157Z triton_convolution2d_231 0.6154 ms 9.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:54:00.7371444Z triton_convolution2d_235 0.7516 ms 8.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, 
num_warps=8 2025-09-07T10:54:00.7374339Z triton_convolution2d_230 1.1151 ms 5.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:54:00.7376546Z SingleProcess AUTOTUNE benchmarking takes 0.3021 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:54:01.7702926Z Autotune Choices Stats: 2025-09-07T10:54:01.7704648Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.028672000393271446, "best_triton_pos": 2, "best_triton_time": 0.06758400052785873, "best_triton_kernel": "triton_convolution2d_269", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:54:01.7809161Z AUTOTUNE convolution(4x2144x7x7, 1024x2144x1x1) 2025-09-07T10:54:01.7809544Z strides: [105056, 49, 7, 1], [2144, 1, 1, 1] 2025-09-07T10:54:01.7809890Z dtypes: torch.float16, torch.float16 2025-09-07T10:54:01.7810197Z convolution 0.0287 ms 100.0% 2025-09-07T10:54:01.7810509Z conv1x1_via_mm 0.0666 ms 43.1% 2025-09-07T10:54:01.7811401Z triton_convolution2d_269 0.0676 ms 42.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:54:01.7813257Z triton_convolution2d_270 0.0942 ms 30.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:54:01.7814768Z triton_convolution2d_268 0.1004 ms 28.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 
2025-09-07T10:54:01.7816264Z triton_convolution2d_271 0.1260 ms 22.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:54:01.7817826Z triton_convolution2d_265 0.1321 ms 21.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:54:01.7819328Z triton_convolution2d_267 0.1382 ms 20.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:54:01.7820812Z triton_convolution2d_266 0.1597 ms 17.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:54:01.7821989Z SingleProcess AUTOTUNE benchmarking takes 0.2365 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:54:02.0622060Z Autotune Choices Stats: 2025-09-07T10:54:02.0623296Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_276", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T10:54:02.0729153Z AUTOTUNE addmm(4x1000, 4x1024, 1024x1000) 2025-09-07T10:54:02.0729766Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T10:54:02.0730378Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:54:02.0736205Z triton_mm_276 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:54:02.0738633Z triton_mm_280 0.0113 ms 
100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:54:02.0740859Z triton_mm_284 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:54:02.0742994Z triton_mm_288 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:54:02.0744374Z bias_addmm 0.0143 ms 78.6% 2025-09-07T10:54:02.0745666Z triton_mm_274 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:54:02.0747719Z triton_mm_275 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:54:02.0749833Z triton_mm_273 0.0154 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:54:02.0751926Z triton_mm_279 0.0154 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:54:02.0754494Z triton_mm_283 0.0154 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:54:02.0756391Z SingleProcess AUTOTUNE benchmarking takes 0.2907 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:54:15.4743900Z Autotune Choices Stats: 2025-09-07T10:54:15.4745156Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_306", "best_kernel_desc": 
"ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T10:54:15.4854570Z AUTOTUNE mm(1000x4, 4x1024) 2025-09-07T10:54:15.4854884Z strides: [1, 1000], [1024, 1] 2025-09-07T10:54:15.4855191Z dtypes: torch.float16, torch.float16 2025-09-07T10:54:15.4855947Z triton_mm_306 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:54:15.4857166Z triton_mm_308 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:54:15.4858439Z triton_mm_309 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:54:15.4859637Z triton_mm_310 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:54:15.4860842Z triton_mm_311 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:54:15.4862034Z triton_mm_312 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:54:15.4863233Z triton_mm_313 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:54:15.4864812Z triton_mm_314 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:54:15.4866038Z triton_mm_315 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:54:15.4867249Z triton_mm_316 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:54:15.4868293Z SingleProcess AUTOTUNE benchmarking takes 0.2211 seconds and 0.0003 seconds precompiling for 17 choices 2025-09-07T10:54:16.1508396Z Autotune Choices Stats: 2025-09-07T10:54:16.1509598Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_293", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T10:54:16.1612520Z AUTOTUNE mm(4x1000, 1000x1024) 2025-09-07T10:54:16.1612876Z strides: [1000, 1], [1024, 1] 2025-09-07T10:54:16.1613183Z dtypes: torch.float16, torch.float16 2025-09-07T10:54:16.1613977Z triton_mm_293 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:54:16.1615198Z triton_mm_297 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:54:16.1616755Z triton_mm_301 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:54:16.1617598Z mm 0.0133 ms 84.6% 2025-09-07T10:54:16.1618294Z triton_mm_305 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:54:16.1619555Z triton_mm_291 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:54:16.1620745Z triton_mm_292 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:54:16.1621927Z triton_mm_296 0.0154 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:54:16.1623125Z triton_mm_300 0.0154 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:54:16.1624325Z triton_mm_290 0.0164 ms 68.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:54:16.1625351Z SingleProcess AUTOTUNE benchmarking takes 0.2525 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T10:54:25.1323308Z pass 2025-09-07T10:54:29.2087340Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:54:29.2088786Z import pynvml # type: ignore[import] 2025-09-07T10:54:32.0199570Z 2025-09-07T10:54:34.4206845Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:54:34.4207222Z loading model: 0it [00:02, ?it/s] 2025-09-07T10:54:34.4207562Z cuda train torch_multimodal_clip 2025-09-07T10:55:07.9035479Z Autotune Choices Stats: 2025-09-07T10:55:07.9037609Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "bias_addmm", "best_time": 0.03481600061058998, "best_triton_pos": 1, "best_triton_time": 0.035840000957250595, "best_triton_kernel": "triton_mm_940", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:55:07.9143228Z AUTOTUNE addmm(2464x2048, 2464x512, 512x2048) 2025-09-07T10:55:07.9143573Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T10:55:07.9143930Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:55:07.9144326Z bias_addmm 0.0348 ms 100.0% 2025-09-07T10:55:07.9145064Z triton_mm_940 0.0358 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:07.9146275Z triton_mm_939 0.0379 ms 91.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:07.9147463Z triton_mm_933 0.0389 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:07.9148647Z triton_mm_936 0.0399 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:07.9149828Z triton_mm_935 0.0410 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, 
BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:07.9151255Z triton_mm_941 0.0420 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:07.9152452Z triton_mm_934 0.0440 ms 79.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:07.9153640Z triton_mm_937 0.0461 ms 75.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:07.9154380Z addmm 0.0502 ms 69.4% 2025-09-07T10:55:07.9154924Z SingleProcess AUTOTUNE benchmarking takes 0.3905 seconds and 0.0006 seconds precompiling for 20 choices 2025-09-07T10:55:09.1805528Z Autotune Choices Stats: 2025-09-07T10:55:09.1807074Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "bias_addmm", "best_time": 0.018432000651955605, "best_triton_pos": 1, "best_triton_time": 0.018432000651955605, "best_triton_kernel": "triton_mm_54", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:55:09.1916468Z AUTOTUNE addmm(200x3072, 200x768, 768x3072) 2025-09-07T10:55:09.1916884Z strides: [0, 1], [768, 1], [1, 768] 2025-09-07T10:55:09.1917232Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:55:09.1917610Z bias_addmm 0.0184 ms 100.0% 2025-09-07T10:55:09.1918353Z triton_mm_54 0.0184 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:09.1919547Z triton_mm_50 0.0195 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:09.1920733Z triton_mm_53 0.0195 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:09.1921904Z triton_mm_56 0.0195 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:09.1923699Z triton_mm_51 0.0215 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:09.1924903Z triton_mm_60 0.0215 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:09.1926095Z triton_mm_52 0.0225 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:09.1927290Z triton_mm_55 0.0225 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:09.1928476Z triton_mm_59 0.0225 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:09.1929514Z SingleProcess AUTOTUNE benchmarking takes 0.3167 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:55:10.0029701Z Autotune Choices Stats: 2025-09-07T10:55:10.0031204Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.02457600086927414, "best_triton_pos": 1, "best_triton_time": 0.02457600086927414, "best_triton_kernel": "triton_mm_861", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:55:10.0137518Z AUTOTUNE mm(200x3072, 3072x768) 2025-09-07T10:55:10.0137827Z strides: [3072, 1], [1, 3072] 2025-09-07T10:55:10.0138127Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:10.0138433Z mm 0.0246 ms 100.0% 2025-09-07T10:55:10.0139186Z triton_mm_861 0.0246 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:10.0140392Z triton_mm_857 0.0266 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:10.0141581Z triton_mm_855 0.0307 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:10.0142761Z triton_mm_856 0.0307 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:10.0143936Z triton_mm_854 0.0338 ms 72.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:10.0145130Z triton_mm_860 0.0348 ms 70.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:10.0146312Z triton_mm_864 0.0389 ms 63.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:10.0147503Z triton_mm_863 0.0440 ms 55.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:10.0148688Z triton_mm_866 0.0461 ms 53.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:10.0149720Z SingleProcess AUTOTUNE benchmarking takes 0.3503 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:55:11.7809842Z Autotune Choices Stats: 2025-09-07T10:55:11.7811808Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_4", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=32, KERNEL_W=32, PADDING_H=0, PADDING_W=0, STRIDE_H=32, STRIDE_W=32, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.46694400906562805, "best_triton_pos": 0} 2025-09-07T10:55:11.7925407Z AUTOTUNE convolution(4x3x224x224, 768x3x32x32) 2025-09-07T10:55:11.7925856Z strides: [150528, 50176, 224, 1], [3072, 1024, 32, 1] 2025-09-07T10:55:11.7926225Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:11.7927128Z triton_convolution2d_4 0.4669 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=32, KERNEL_W=32, PADDING_H=0, PADDING_W=0, STRIDE_H=32, STRIDE_W=32, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:55:11.7928661Z triton_convolution2d_6 0.8172 ms 57.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=32, KERNEL_W=32, PADDING_H=0, PADDING_W=0, STRIDE_H=32, STRIDE_W=32, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:55:11.7930182Z triton_convolution2d_3 0.8837 ms 52.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=32, KERNEL_W=32, PADDING_H=0, PADDING_W=0, STRIDE_H=32, STRIDE_W=32, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:55:11.7931693Z triton_convolution2d_1 0.9544 ms 48.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=32, KERNEL_W=32, PADDING_H=0, PADDING_W=0, STRIDE_H=32, 
STRIDE_W=32, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:55:11.7932603Z convolution 0.9554 ms 48.9% 2025-09-07T10:55:11.7933697Z triton_convolution2d_2 1.0660 ms 43.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=32, KERNEL_W=32, PADDING_H=0, PADDING_W=0, STRIDE_H=32, STRIDE_W=32, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:55:11.7935200Z triton_convolution2d_5 1.0988 ms 42.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=32, KERNEL_W=32, PADDING_H=0, PADDING_W=0, STRIDE_H=32, STRIDE_W=32, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:55:11.7936704Z triton_convolution2d_0 1.2401 ms 37.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=32, KERNEL_W=32, PADDING_H=0, PADDING_W=0, STRIDE_H=32, STRIDE_W=32, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:55:11.7937923Z SingleProcess AUTOTUNE benchmarking takes 0.3310 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:55:12.0613441Z Autotune Choices Stats: 2025-09-07T10:55:12.0614899Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.015359999611973763, "best_triton_pos": 1, "best_triton_time": 0.016383999958634377, "best_triton_kernel": "triton_mm_18", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:55:12.0730399Z AUTOTUNE mm(200x768, 768x2304) 2025-09-07T10:55:12.0730705Z strides: [768, 1], [1, 768] 2025-09-07T10:55:12.0731007Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:12.0731322Z mm 0.0154 ms 100.0% 2025-09-07T10:55:12.0732020Z triton_mm_18 0.0164 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:12.0733199Z triton_mm_14 0.0174 ms 88.2% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:12.0734379Z triton_mm_17 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:12.0735565Z triton_mm_20 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:12.0736954Z triton_mm_24 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:12.0738198Z triton_mm_8 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:12.0739384Z triton_mm_15 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:12.0740570Z triton_mm_16 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:12.0741752Z triton_mm_19 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:12.0742782Z SingleProcess AUTOTUNE benchmarking takes 0.2798 seconds and 0.0004 seconds precompiling for 19 choices 2025-09-07T10:55:12.3339188Z Autotune Choices Stats: 2025-09-07T10:55:12.3340659Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.01228800043463707, "best_triton_pos": 1, "best_triton_time": 0.01228800043463707, "best_triton_kernel": "triton_mm_29", 
"best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:55:12.3450438Z AUTOTUNE mm(200x768, 768x768) 2025-09-07T10:55:12.3451130Z strides: [768, 1], [1, 768] 2025-09-07T10:55:12.3451419Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:12.3451728Z mm 0.0123 ms 100.0% 2025-09-07T10:55:12.3452461Z triton_mm_29 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:12.3453669Z triton_mm_33 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:12.3454852Z triton_mm_27 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:12.3456018Z triton_mm_28 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:12.3457185Z triton_mm_26 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:12.3458442Z triton_mm_32 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:12.3459619Z triton_mm_36 0.0154 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:12.3460805Z triton_mm_35 0.0164 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:12.3461982Z triton_mm_38 0.0164 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:12.3463017Z SingleProcess AUTOTUNE benchmarking takes 0.2705 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:55:12.7067997Z Autotune Choices Stats: 2025-09-07T10:55:12.7069505Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_875", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:55:12.7179384Z AUTOTUNE mm(4x768, 768x512) 2025-09-07T10:55:12.7179673Z strides: [768, 1], [512, 1] 2025-09-07T10:55:12.7179968Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:12.7180741Z triton_mm_875 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:55:12.7181943Z triton_mm_879 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:12.7183146Z triton_mm_874 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:55:12.7184334Z triton_mm_887 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:12.7185064Z mm 0.0123 ms 75.0% 2025-09-07T10:55:12.7185755Z triton_mm_873 0.0123 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:12.7186938Z triton_mm_878 0.0123 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:12.7188356Z triton_mm_883 0.0123 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:55:12.7189550Z triton_mm_872 0.0133 ms 69.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:55:12.7190744Z triton_mm_882 0.0133 ms 69.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:12.7191755Z SingleProcess AUTOTUNE benchmarking takes 0.2527 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T10:55:13.0329011Z Autotune Choices Stats: 2025-09-07T10:55:13.0330194Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_899", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.030719999223947525, "best_triton_pos": 0} 2025-09-07T10:55:13.0442132Z AUTOTUNE mm(2464x512, 512x1536) 2025-09-07T10:55:13.0442446Z strides: [512, 1], [1, 512] 2025-09-07T10:55:13.0442744Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:13.0443728Z triton_mm_899 0.0307 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.0444934Z triton_mm_897 0.0317 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.0446124Z triton_mm_900 0.0317 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.0446869Z mm 0.0328 ms 93.7% 2025-09-07T10:55:13.0447572Z triton_mm_903 0.0328 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.0448764Z triton_mm_904 0.0328 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.0450209Z triton_mm_898 0.0358 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:13.0451393Z triton_mm_901 0.0358 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:13.0452582Z triton_mm_905 0.0369 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:13.0453773Z triton_mm_894 0.0399 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:13.0454799Z SingleProcess AUTOTUNE benchmarking takes 0.3256 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:55:13.3118069Z Autotune Choices Stats: 2025-09-07T10:55:13.3121665Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.016383999958634377, "best_triton_pos": 1, "best_triton_time": 0.016383999958634377, "best_triton_kernel": "triton_mm_923", 
"best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T10:55:13.3240858Z AUTOTUNE mm(2464x512, 512x512) 2025-09-07T10:55:13.3241527Z strides: [512, 1], [1, 512] 2025-09-07T10:55:13.3242184Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:13.3243509Z mm 0.0164 ms 100.0% 2025-09-07T10:55:13.3245225Z triton_mm_923 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:13.3248141Z triton_mm_917 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.3251035Z triton_mm_922 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.3253887Z triton_mm_913 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:13.3256740Z triton_mm_915 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.3259674Z triton_mm_916 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:13.3262558Z triton_mm_918 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.3265436Z triton_mm_919 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:13.3268320Z triton_mm_921 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.3270774Z SingleProcess AUTOTUNE benchmarking takes 0.2790 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:55:13.6637197Z Autotune Choices Stats: 2025-09-07T10:55:13.6641138Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.03481600061058998, "best_triton_pos": 1, "best_triton_time": 0.04198399931192398, "best_triton_kernel": "triton_mm_953", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:55:13.6749580Z AUTOTUNE mm(2464x2048, 2048x512) 2025-09-07T10:55:13.6749899Z strides: [2048, 1], [1, 2048] 2025-09-07T10:55:13.6750203Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:13.6750495Z mm 0.0348 ms 100.0% 2025-09-07T10:55:13.6751213Z triton_mm_953 0.0420 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.6752415Z triton_mm_958 0.0430 ms 81.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.6753623Z triton_mm_949 0.0451 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:13.6754817Z triton_mm_959 0.0451 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T10:55:13.6756002Z triton_mm_950 0.0471 ms 73.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:13.6757171Z triton_mm_954 0.0492 ms 70.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.6758356Z triton_mm_951 0.0512 ms 68.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.6759761Z triton_mm_952 0.0512 ms 68.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:13.6760958Z triton_mm_957 0.0522 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:13.6761996Z SingleProcess AUTOTUNE benchmarking takes 0.3486 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:55:14.0396519Z Autotune Choices Stats: 2025-09-07T10:55:14.0397697Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_1756", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:55:14.0512197Z AUTOTUNE mm(32x512, 512x512) 2025-09-07T10:55:14.0512517Z strides: [512, 1], [1, 512] 2025-09-07T10:55:14.0512820Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:14.0513598Z triton_mm_1756 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:14.0514823Z 
triton_mm_1760 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:14.0516016Z triton_mm_1755 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:14.0516773Z mm 0.0113 ms 81.8% 2025-09-07T10:55:14.0517466Z triton_mm_1753 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:14.0518668Z triton_mm_1754 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:14.0520087Z triton_mm_1759 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:14.0521290Z triton_mm_1763 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:14.0522482Z triton_mm_1764 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:55:14.0523928Z triton_mm_1766 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:14.0524965Z SingleProcess AUTOTUNE benchmarking takes 0.2476 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T10:55:40.4182891Z Autotune Choices Stats: 2025-09-07T10:55:40.4185509Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 
0.03276799991726875, "best_triton_pos": 1, "best_triton_time": 0.03379200026392937, "best_triton_kernel": "triton_mm_1818", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:55:40.4297267Z AUTOTUNE mm(2464x512, 512x2048) 2025-09-07T10:55:40.4297681Z strides: [512, 1], [2048, 1] 2025-09-07T10:55:40.4297971Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:40.4298280Z mm 0.0328 ms 100.0% 2025-09-07T10:55:40.4299366Z triton_mm_1818 0.0338 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:40.4300597Z triton_mm_1819 0.0338 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:40.4301820Z triton_mm_1812 0.0358 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:40.4303005Z triton_mm_1815 0.0358 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:40.4304203Z triton_mm_1814 0.0369 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:40.4305409Z triton_mm_1820 0.0399 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:40.4306620Z triton_mm_1813 0.0410 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=8 2025-09-07T10:55:40.4307817Z triton_mm_1816 0.0420 ms 78.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:40.4309013Z triton_mm_1809 0.0461 ms 71.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:40.4310044Z SingleProcess AUTOTUNE benchmarking takes 0.3378 seconds and 0.0004 seconds precompiling for 19 choices 2025-09-07T10:55:41.2614977Z Autotune Choices Stats: 2025-09-07T10:55:41.2616229Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_3591", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T10:55:41.2726140Z AUTOTUNE mm(768x200, 200x3072) 2025-09-07T10:55:41.2726860Z strides: [1, 768], [3072, 1] 2025-09-07T10:55:41.2727225Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:41.2728011Z triton_mm_3591 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:41.2728792Z mm 0.0164 ms 93.7% 2025-09-07T10:55:41.2729506Z triton_mm_3592 0.0164 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:41.2730737Z triton_mm_3593 0.0164 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:41.2731952Z triton_mm_3594 0.0164 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:41.2733171Z triton_mm_3597 0.0164 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:41.2734380Z triton_mm_3595 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:41.2735580Z triton_mm_3596 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T10:55:41.2736979Z triton_mm_3598 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:41.2738263Z triton_mm_3587 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:41.2739308Z SingleProcess AUTOTUNE benchmarking takes 0.2671 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:41.7659300Z Autotune Choices Stats: 2025-09-07T10:55:41.7660535Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_3627", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T10:55:41.7769689Z AUTOTUNE mm(3072x200, 200x768) 2025-09-07T10:55:41.7770017Z strides: [1, 3072], [768, 1] 2025-09-07T10:55:41.7770320Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:41.7771091Z triton_mm_3627 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=4 2025-09-07T10:55:41.7772341Z triton_mm_3630 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:41.7773102Z mm 0.0164 ms 93.7% 2025-09-07T10:55:41.7773816Z triton_mm_3633 0.0164 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:41.7775026Z triton_mm_3628 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:41.7776240Z triton_mm_3629 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:41.7777494Z triton_mm_3631 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:41.7779087Z triton_mm_3632 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T10:55:41.7780314Z triton_mm_3634 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:41.7781522Z triton_mm_3623 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:41.7782559Z SingleProcess AUTOTUNE benchmarking takes 0.2839 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:42.3714714Z Autotune Choices Stats: 2025-09-07T10:55:42.3715957Z {"num_choices": 19, 
"num_triton_choices": 18, "best_kernel": "triton_mm_3681", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T10:55:42.3824824Z AUTOTUNE mm(2304x200, 200x768) 2025-09-07T10:55:42.3825140Z strides: [1, 2304], [768, 1] 2025-09-07T10:55:42.3825443Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:42.3826232Z triton_mm_3681 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:42.3827763Z triton_mm_3684 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:42.3828986Z triton_mm_3682 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:42.3830205Z triton_mm_3683 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:42.3831411Z triton_mm_3685 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:42.3835104Z triton_mm_3687 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:42.3836342Z triton_mm_3688 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:42.3837559Z triton_mm_3689 
0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:42.3838317Z mm 0.0154 ms 86.7% 2025-09-07T10:55:42.3839014Z triton_mm_3678 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:42.3840052Z SingleProcess AUTOTUNE benchmarking takes 0.2608 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:43.0874602Z Autotune Choices Stats: 2025-09-07T10:55:43.0877257Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.03993599861860275, "best_triton_pos": 1, "best_triton_time": 0.04915200173854828, "best_triton_kernel": "triton_mm_1832", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:55:43.0984368Z AUTOTUNE mm(512x2464, 2464x2048) 2025-09-07T10:55:43.0984909Z strides: [1, 512], [2048, 1] 2025-09-07T10:55:43.0985386Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:43.0986318Z mm 0.0399 ms 100.0% 2025-09-07T10:55:43.0987634Z triton_mm_1832 0.0492 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:43.0989796Z triton_mm_1837 0.0502 ms 79.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:43.0991943Z triton_mm_1828 0.0522 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:43.0994127Z triton_mm_1838 0.0522 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:43.0996373Z triton_mm_1830 0.0532 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:43.0998473Z triton_mm_1836 0.0553 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:43.1000582Z triton_mm_1833 0.0573 ms 69.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:43.1003361Z triton_mm_1831 0.0604 ms 66.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:43.1005515Z triton_mm_1834 0.0604 ms 66.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:43.1007329Z SingleProcess AUTOTUNE benchmarking takes 0.3528 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:43.4639717Z Autotune Choices Stats: 2025-09-07T10:55:43.4641235Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.03993599861860275, "best_triton_pos": 1, "best_triton_time": 0.04915200173854828, "best_triton_kernel": "triton_mm_1868", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:55:43.4750036Z AUTOTUNE mm(2048x2464, 2464x512) 2025-09-07T10:55:43.4750356Z strides: [1, 2048], [512, 1] 2025-09-07T10:55:43.4750654Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:43.4750974Z mm 0.0399 ms 100.0% 
2025-09-07T10:55:43.4751696Z triton_mm_1868 0.0492 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:43.4752940Z triton_mm_1873 0.0502 ms 79.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:43.4754148Z triton_mm_1864 0.0522 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:43.4755355Z triton_mm_1866 0.0522 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:43.4756561Z triton_mm_1874 0.0522 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:43.4757755Z triton_mm_1869 0.0553 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:43.4759213Z triton_mm_1872 0.0553 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:43.4760429Z triton_mm_1867 0.0604 ms 66.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:43.4761630Z triton_mm_1870 0.0604 ms 66.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:43.4762672Z SingleProcess AUTOTUNE benchmarking takes 0.3515 seconds and 0.0002 
seconds precompiling for 19 choices 2025-09-07T10:55:44.1130408Z Autotune Choices Stats: 2025-09-07T10:55:44.1131952Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.02969600073993206, "best_triton_pos": 1, "best_triton_time": 0.03788800165057182, "best_triton_kernel": "triton_mm_1940", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:55:44.1242194Z AUTOTUNE mm(1536x2464, 2464x512) 2025-09-07T10:55:44.1242518Z strides: [1, 1536], [512, 1] 2025-09-07T10:55:44.1243036Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:44.1243361Z mm 0.0297 ms 100.0% 2025-09-07T10:55:44.1244095Z triton_mm_1940 0.0379 ms 78.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:44.1245550Z triton_mm_1936 0.0389 ms 76.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:44.1246774Z triton_mm_1939 0.0399 ms 74.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:44.1247971Z triton_mm_1942 0.0399 ms 74.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:44.1249172Z triton_mm_1937 0.0440 ms 67.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:44.1250481Z triton_mm_1938 0.0481 ms 61.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T10:55:44.1251683Z triton_mm_1941 0.0492 ms 60.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:44.1252906Z triton_mm_1945 0.0502 ms 59.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:44.1254110Z triton_mm_1946 0.0522 ms 56.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:44.1255160Z SingleProcess AUTOTUNE benchmarking takes 0.3387 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:44.7464622Z Autotune Choices Stats: 2025-09-07T10:55:44.7466128Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.015359999611973763, "best_triton_pos": 1, "best_triton_time": 0.01740800030529499, "best_triton_kernel": "triton_mm_3575", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:55:44.7576821Z AUTOTUNE mm(200x768, 768x3072) 2025-09-07T10:55:44.7577135Z strides: [768, 1], [3072, 1] 2025-09-07T10:55:44.7577812Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:44.7578116Z mm 0.0154 ms 100.0% 2025-09-07T10:55:44.7578831Z triton_mm_3575 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:44.7580038Z triton_mm_3571 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:44.7581245Z triton_mm_3574 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:44.7582443Z triton_mm_3577 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:44.7583642Z triton_mm_3573 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:44.7584832Z triton_mm_3580 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:44.7586046Z triton_mm_3581 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:44.7587336Z triton_mm_3572 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:44.7588536Z triton_mm_3576 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:44.7589580Z SingleProcess AUTOTUNE benchmarking takes 0.2735 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:45.5658281Z Autotune Choices Stats: 2025-09-07T10:55:45.5659527Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_3664", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T10:55:45.5776932Z AUTOTUNE mm(768x200, 200x768) 2025-09-07T10:55:45.5777250Z strides: [1, 768], [768, 1] 
2025-09-07T10:55:45.5777631Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:45.5778396Z triton_mm_3664 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:45.5779163Z mm 0.0113 ms 90.9% 2025-09-07T10:55:45.5779874Z triton_mm_3661 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:45.5781076Z triton_mm_3663 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:45.5782283Z triton_mm_3665 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:45.5783492Z triton_mm_3666 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:45.5784684Z triton_mm_3667 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:45.5786100Z triton_mm_3656 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:45.5787317Z triton_mm_3669 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:45.5788541Z triton_mm_3655 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=2, num_warps=4 2025-09-07T10:55:45.5789583Z SingleProcess AUTOTUNE benchmarking takes 0.2508 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:45.9916336Z Autotune Choices Stats: 2025-09-07T10:55:45.9917592Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_3533", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T10:55:46.0027542Z AUTOTUNE mm(768x4, 4x512) 2025-09-07T10:55:46.0027828Z strides: [1, 768], [512, 1] 2025-09-07T10:55:46.0028127Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:46.0028907Z triton_mm_3533 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:46.0030136Z triton_mm_3539 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:46.0034956Z triton_mm_3540 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:46.0036175Z triton_mm_3541 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:55:46.0037391Z triton_mm_3543 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:46.0038596Z triton_mm_3531 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 
2025-09-07T10:55:46.0039928Z triton_mm_3532 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:46.0041128Z triton_mm_3534 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:46.0042333Z triton_mm_3535 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:46.0043803Z triton_mm_3536 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:46.0044841Z SingleProcess AUTOTUNE benchmarking takes 0.2211 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:55:46.4358214Z Autotune Choices Stats: 2025-09-07T10:55:46.4359482Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_1771", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T10:55:46.4472078Z AUTOTUNE mm(512x32, 32x512) 2025-09-07T10:55:46.4472356Z strides: [1, 512], [512, 1] 2025-09-07T10:55:46.4473016Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:46.4473793Z triton_mm_1771 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:46.4475002Z triton_mm_1773 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:46.4476212Z 
triton_mm_1775 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:46.4477419Z triton_mm_1776 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:46.4478611Z triton_mm_1777 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:46.4479370Z mm 0.0082 ms 87.5% 2025-09-07T10:55:46.4480064Z triton_mm_1769 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:55:46.4481253Z triton_mm_1770 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:46.4482537Z triton_mm_1772 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:46.4483944Z triton_mm_1774 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:46.4484965Z SingleProcess AUTOTUNE benchmarking takes 0.2289 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:55:46.7828334Z Autotune Choices Stats: 2025-09-07T10:55:46.7840904Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.01945599913597107, "best_triton_pos": 1, "best_triton_time": 0.023552000522613525, "best_triton_kernel": "triton_mm_1901", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, 
BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:55:46.7942305Z AUTOTUNE mm(512x2464, 2464x512) 2025-09-07T10:55:46.7942608Z strides: [1, 512], [512, 1] 2025-09-07T10:55:46.7942912Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:46.7943229Z mm 0.0195 ms 100.0% 2025-09-07T10:55:46.7943949Z triton_mm_1901 0.0236 ms 82.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:46.7945167Z triton_mm_1900 0.0297 ms 65.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:46.7946381Z triton_mm_1894 0.0328 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:46.7947591Z triton_mm_1904 0.0338 ms 57.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:46.7948785Z triton_mm_1895 0.0369 ms 52.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:46.7950215Z triton_mm_1896 0.0369 ms 52.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:46.7951419Z triton_mm_1903 0.0379 ms 51.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:46.7952694Z triton_mm_1906 0.0379 ms 51.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=8 2025-09-07T10:55:46.7953888Z triton_mm_1897 0.0410 ms 47.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:46.7954935Z SingleProcess AUTOTUNE benchmarking takes 0.3230 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:47.1834225Z Autotune Choices Stats: 2025-09-07T10:55:47.1835467Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_1790", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:55:47.1948299Z AUTOTUNE mm(32x512, 512x512) 2025-09-07T10:55:47.1948601Z strides: [512, 1], [512, 1] 2025-09-07T10:55:47.1948898Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:47.1949686Z triton_mm_1790 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:47.1951115Z triton_mm_1788 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:47.1952320Z triton_mm_1789 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:47.1953519Z triton_mm_1794 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:47.1954268Z mm 0.0113 ms 81.8% 2025-09-07T10:55:47.1954959Z triton_mm_1787 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=2, num_warps=4 2025-09-07T10:55:47.1956266Z triton_mm_1793 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:47.1957467Z triton_mm_1797 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:47.1958678Z triton_mm_1798 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:55:47.1959870Z triton_mm_1800 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:47.1960903Z SingleProcess AUTOTUNE benchmarking takes 0.2392 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:55:48.8988567Z Autotune Choices Stats: 2025-09-07T10:55:48.8990088Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.035840000957250595, "best_triton_pos": 1, "best_triton_time": 0.04095999896526337, "best_triton_kernel": "triton_mm_1850", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:55:48.9105374Z AUTOTUNE mm(2464x2048, 2048x512) 2025-09-07T10:55:48.9105711Z strides: [2048, 1], [512, 1] 2025-09-07T10:55:48.9106014Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:48.9106687Z mm 0.0358 ms 100.0% 2025-09-07T10:55:48.9107418Z triton_mm_1850 0.0410 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:48.9108636Z triton_mm_1856 0.0420 ms 85.4% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:48.9109834Z triton_mm_1846 0.0440 ms 81.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:48.9111033Z triton_mm_1855 0.0451 ms 79.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:48.9112242Z triton_mm_1847 0.0481 ms 74.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:48.9113437Z triton_mm_1848 0.0481 ms 74.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:48.9114639Z triton_mm_1854 0.0502 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:48.9115966Z triton_mm_1851 0.0512 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:48.9117162Z triton_mm_1849 0.0522 ms 68.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:48.9118189Z SingleProcess AUTOTUNE benchmarking takes 0.3385 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:49.1687416Z Autotune Choices Stats: 2025-09-07T10:55:49.1688601Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_1892", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T10:55:49.1799238Z AUTOTUNE mm(2464x512, 512x512) 2025-09-07T10:55:49.1799604Z strides: [512, 1], [512, 1] 2025-09-07T10:55:49.1799900Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:49.1800680Z triton_mm_1892 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:49.1801442Z mm 0.0174 ms 94.1% 2025-09-07T10:55:49.1802152Z triton_mm_1886 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:49.1803542Z triton_mm_1887 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:49.1804749Z triton_mm_1891 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:49.1805938Z triton_mm_1882 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:49.1807134Z triton_mm_1890 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:49.1808531Z triton_mm_1884 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:49.1809731Z triton_mm_1885 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:49.1810925Z triton_mm_1888 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:49.1811967Z SingleProcess AUTOTUNE benchmarking takes 0.2674 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:49.4872661Z Autotune Choices Stats: 2025-09-07T10:55:49.4874144Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.02969600073993206, "best_triton_pos": 1, "best_triton_time": 0.03174399957060814, "best_triton_kernel": "triton_mm_1922", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:55:49.4991993Z AUTOTUNE mm(2464x1536, 1536x512) 2025-09-07T10:55:49.4992368Z strides: [1536, 1], [512, 1] 2025-09-07T10:55:49.4992666Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:49.4992974Z mm 0.0297 ms 100.0% 2025-09-07T10:55:49.4993686Z triton_mm_1922 0.0317 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:49.4995094Z triton_mm_1928 0.0348 ms 85.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:49.4996297Z triton_mm_1918 0.0358 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:49.4997559Z triton_mm_1927 0.0358 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:49.4998755Z triton_mm_1920 0.0389 ms 76.3% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:49.5000027Z triton_mm_1921 0.0389 ms 76.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:49.5001223Z triton_mm_1924 0.0389 ms 76.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:49.5002421Z triton_mm_1919 0.0399 ms 74.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:49.5003805Z triton_mm_1926 0.0399 ms 74.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:49.5004851Z SingleProcess AUTOTUNE benchmarking takes 0.3187 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:49.8764762Z Autotune Choices Stats: 2025-09-07T10:55:49.8765978Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_3551", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T10:55:49.8885276Z AUTOTUNE mm(4x512, 512x768) 2025-09-07T10:55:49.8885578Z strides: [512, 1], [1, 512] 2025-09-07T10:55:49.8885862Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:49.8886910Z triton_mm_3551 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:55:49.8888135Z triton_mm_3555 0.0092 ms 100.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:49.8889346Z triton_mm_3549 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:49.8890543Z triton_mm_3550 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:55:49.8891728Z triton_mm_3554 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:49.8892917Z triton_mm_3559 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:55:49.8894113Z triton_mm_3563 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:49.8894860Z mm 0.0113 ms 81.8% 2025-09-07T10:55:49.8895552Z triton_mm_3548 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:55:49.8896831Z triton_mm_3558 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:49.8897926Z SingleProcess AUTOTUNE benchmarking takes 0.2365 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:55:50.2165794Z Autotune Choices Stats: 2025-09-07T10:55:50.2166994Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_3604", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.020479999482631683, "best_triton_pos": 0} 2025-09-07T10:55:50.2286952Z AUTOTUNE mm(200x3072, 3072x768) 2025-09-07T10:55:50.2287426Z strides: [3072, 1], [768, 1] 2025-09-07T10:55:50.2287709Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:50.2288493Z triton_mm_3604 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:50.2289267Z mm 0.0236 ms 87.0% 2025-09-07T10:55:50.2289964Z triton_mm_3608 0.0246 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:50.2291167Z triton_mm_3602 0.0297 ms 69.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:50.2292364Z triton_mm_3603 0.0297 ms 69.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:50.2293555Z triton_mm_3601 0.0348 ms 58.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:50.2294734Z triton_mm_3607 0.0348 ms 58.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:50.2295922Z triton_mm_3611 0.0379 ms 54.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:50.2297317Z triton_mm_3610 0.0461 ms 44.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:50.2298528Z triton_mm_3613 0.0492 ms 41.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:50.2299571Z SingleProcess AUTOTUNE benchmarking takes 0.3383 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:50.4783801Z Autotune Choices Stats: 2025-09-07T10:55:50.4785290Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.01228800043463707, "best_triton_pos": 1, "best_triton_time": 0.01228800043463707, "best_triton_kernel": "triton_mm_3640", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:55:50.4903114Z AUTOTUNE mm(200x768, 768x768) 2025-09-07T10:55:50.4903445Z strides: [768, 1], [768, 1] 2025-09-07T10:55:50.4903746Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:50.4904044Z mm 0.0123 ms 100.0% 2025-09-07T10:55:50.4904763Z triton_mm_3640 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:50.4905982Z triton_mm_3644 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:50.4907462Z triton_mm_3638 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:50.4908671Z triton_mm_3639 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:50.4909862Z 
triton_mm_3643 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:50.4911037Z triton_mm_3637 0.0154 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:50.4912331Z triton_mm_3647 0.0154 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:50.4913533Z triton_mm_3646 0.0164 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:50.4914737Z triton_mm_3649 0.0174 ms 70.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:50.4915778Z SingleProcess AUTOTUNE benchmarking takes 0.2609 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:55:50.7909036Z Autotune Choices Stats: 2025-09-07T10:55:50.7910501Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.018432000651955605, "best_triton_pos": 1, "best_triton_time": 0.018432000651955605, "best_triton_kernel": "triton_mm_3694", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:55:50.8027538Z AUTOTUNE mm(200x2304, 2304x768) 2025-09-07T10:55:50.8027874Z strides: [2304, 1], [768, 1] 2025-09-07T10:55:50.8028175Z dtypes: torch.float16, torch.float16 2025-09-07T10:55:50.8028483Z mm 0.0184 ms 100.0% 2025-09-07T10:55:50.8029462Z triton_mm_3694 0.0184 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:50.8030663Z triton_mm_3698 0.0205 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:55:50.8031854Z triton_mm_3692 0.0246 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:50.8033056Z triton_mm_3693 0.0246 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:55:50.8034242Z triton_mm_3697 0.0276 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:55:50.8035435Z triton_mm_3691 0.0287 ms 64.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:55:50.8036633Z triton_mm_3701 0.0297 ms 62.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:55:50.8037838Z triton_mm_3700 0.0358 ms 51.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:50.8039103Z triton_mm_3703 0.0389 ms 47.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:55:50.8040140Z SingleProcess AUTOTUNE benchmarking takes 0.3106 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:55:59.0101919Z skipping cudagraphs due to disabling cudagraphs due to incompatible op 
aten.index_put_.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T10:55:59.0103178Z pred = mod(*cloned_inputs) 2025-09-07T10:55:59.0103839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchmultimodal/models/clip/model.py", line 72, in forward 2025-09-07T10:55:59.0104804Z embeddings_b = self.encoder_b(features_b) 2025-09-07T10:55:59.0105528Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchmultimodal/models/clip/text_encoder.py", line 132, in forward 2025-09-07T10:55:59.0106366Z hidden_state[torch.arange(hidden_state.shape[0]), text.argmax(dim=-1)] 2025-09-07T10:55:59.0106715Z 2025-09-07T10:55:59.0106720Z 2025-09-07T10:56:00.5074224Z W0907 10:56:00.506000 115884 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:56:48.9260851Z pass 2025-09-07T10:56:55.4822975Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:56:55.4824465Z import pynvml # type: ignore[import] 2025-09-07T10:56:58.2332351Z 2025-09-07T10:56:59.1613506Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:56:59.1614034Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:56:59.1614528Z cuda train tts_angular 2025-09-07T10:57:04.0533144Z W0907 10:57:04.052000 123476 site-packages/torch/_logging/_internal.py:1199] [10/0] Profiler function will be ignored 2025-09-07T10:57:06.4800343Z pass 2025-09-07T10:57:09.3782022Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. 
If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:57:09.3783590Z import pynvml # type: ignore[import] 2025-09-07T10:57:12.0762244Z 2025-09-07T10:57:15.1610488Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:57:15.1610866Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:57:15.1611187Z cuda train vgg16 2025-09-07T10:57:26.6505761Z Autotune Choices Stats: 2025-09-07T10:57:26.6507013Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_98", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.16793599724769592, "best_triton_pos": 0} 2025-09-07T10:57:26.6637167Z AUTOTUNE mm(4x25088, 25088x4096) 2025-09-07T10:57:26.6637800Z strides: [25088, 1], [1, 25088] 2025-09-07T10:57:26.6638213Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:26.6639478Z triton_mm_98 0.1679 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:26.6640698Z triton_mm_94 0.1710 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:57:26.6641446Z mm 0.1761 ms 95.3% 2025-09-07T10:57:26.6642156Z triton_mm_102 0.1761 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:26.6643882Z triton_mm_106 0.1925 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:57:26.6645089Z triton_mm_91 0.2324 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:57:26.6646270Z triton_mm_97 0.2324 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:26.6647457Z triton_mm_93 0.2355 ms 71.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:57:26.6648832Z triton_mm_92 0.2427 ms 69.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:26.6650005Z triton_mm_101 0.2488 ms 67.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:26.6651050Z SingleProcess AUTOTUNE benchmarking takes 0.6618 seconds and 0.5894 seconds precompiling for 18 choices 2025-09-07T10:57:27.3541857Z Autotune Choices Stats: 2025-09-07T10:57:27.3543095Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_111", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.043007999658584595, "best_triton_pos": 0} 2025-09-07T10:57:27.3666688Z AUTOTUNE mm(4x4096, 4096x4096) 2025-09-07T10:57:27.3667001Z strides: [4096, 1], [1, 4096] 2025-09-07T10:57:27.3667305Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:27.3668073Z triton_mm_111 0.0430 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:57:27.3669280Z triton_mm_115 0.0430 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:27.3670836Z triton_mm_119 0.0451 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:27.3671594Z mm 0.0461 ms 93.3% 2025-09-07T10:57:27.3672273Z triton_mm_123 0.0471 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:57:27.3673462Z triton_mm_114 0.0512 ms 84.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:27.3674652Z triton_mm_108 0.0522 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:57:27.3675843Z triton_mm_118 0.0543 ms 79.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:27.3677025Z triton_mm_109 0.0553 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:27.3678204Z triton_mm_110 0.0553 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:57:27.3679229Z SingleProcess AUTOTUNE benchmarking takes 0.3572 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T10:57:27.9970543Z Autotune Choices Stats: 2025-09-07T10:57:27.9972238Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.04608000069856644, "best_triton_pos": 1, "best_triton_time": 0.08806400001049042, "best_triton_kernel": "triton_convolution2d_0", 
"best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:57:28.0094687Z AUTOTUNE convolution(4x3x224x224, 64x3x3x3) 2025-09-07T10:57:28.0095057Z strides: [150528, 1, 672, 3], [27, 1, 9, 3] 2025-09-07T10:57:28.0095393Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:28.0095701Z convolution 0.0461 ms 100.0% 2025-09-07T10:57:28.0096826Z triton_convolution2d_0 0.0881 ms 52.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.0098376Z triton_convolution2d_3 0.0901 ms 51.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.0099867Z triton_convolution2d_4 0.0932 ms 49.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.0101349Z triton_convolution2d_2 0.1178 ms 39.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:57:28.0102827Z triton_convolution2d_5 0.1208 ms 38.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.0104302Z triton_convolution2d_1 0.1731 ms 26.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.0105473Z 
SingleProcess AUTOTUNE benchmarking takes 0.2291 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T10:57:28.2518778Z Autotune Choices Stats: 2025-09-07T10:57:28.2520422Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.10342399775981903, "best_triton_pos": 1, "best_triton_time": 0.1372160017490387, "best_triton_kernel": "triton_convolution2d_12", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:57:28.2636528Z AUTOTUNE convolution(4x64x224x224, 64x64x3x3) 2025-09-07T10:57:28.2636957Z strides: [3211264, 1, 14336, 64], [576, 1, 192, 64] 2025-09-07T10:57:28.2637328Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:28.2637638Z convolution 0.1034 ms 100.0% 2025-09-07T10:57:28.2638532Z triton_convolution2d_12 0.1372 ms 75.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.2640028Z triton_convolution2d_9 0.1485 ms 69.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.2641504Z triton_convolution2d_10 0.1505 ms 68.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.2643157Z triton_convolution2d_7 0.1679 ms 61.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.2644761Z triton_convolution2d_11 0.1710 ms 60.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.2646241Z triton_convolution2d_6 0.2806 ms 36.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.2647707Z triton_convolution2d_8 0.5171 ms 20.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:57:28.2648962Z SingleProcess AUTOTUNE benchmarking takes 0.2533 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:57:28.4494160Z Autotune Choices Stats: 2025-09-07T10:57:28.4495819Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.050175998359918594, "best_triton_pos": 1, "best_triton_time": 0.062463998794555664, "best_triton_kernel": "triton_convolution2d_16", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:57:28.4612760Z AUTOTUNE convolution(4x64x112x112, 128x64x3x3) 2025-09-07T10:57:28.4613234Z strides: [802816, 1, 7168, 64], [576, 1, 192, 64] 2025-09-07T10:57:28.4613597Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:28.4613906Z convolution 0.0502 ms 100.0% 2025-09-07T10:57:28.4614798Z triton_convolution2d_16 0.0625 ms 80.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.4616305Z triton_convolution2d_19 0.0686 ms 73.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, 
STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.4618058Z triton_convolution2d_18 0.0696 ms 72.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.4619551Z triton_convolution2d_14 0.0881 ms 57.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.4621038Z triton_convolution2d_17 0.0922 ms 54.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.4622518Z triton_convolution2d_13 0.1065 ms 47.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.4623999Z triton_convolution2d_15 0.2417 ms 20.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:57:28.4625173Z SingleProcess AUTOTUNE benchmarking takes 0.1968 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:57:28.6891361Z Autotune Choices Stats: 2025-09-07T10:57:28.6892999Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.07577600330114365, "best_triton_pos": 1, "best_triton_time": 0.11366400122642517, "best_triton_kernel": "triton_convolution2d_23", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:57:28.7009357Z AUTOTUNE 
convolution(4x128x112x112, 128x128x3x3) 2025-09-07T10:57:28.7009767Z strides: [1605632, 1, 14336, 128], [1152, 1, 384, 128] 2025-09-07T10:57:28.7010138Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:28.7010442Z convolution 0.0758 ms 100.0% 2025-09-07T10:57:28.7011338Z triton_convolution2d_23 0.1137 ms 66.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.7012834Z triton_convolution2d_26 0.1280 ms 59.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.7014423Z triton_convolution2d_24 0.1454 ms 52.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.7015906Z triton_convolution2d_25 0.1495 ms 50.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.7017463Z triton_convolution2d_21 0.1608 ms 47.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.7018952Z triton_convolution2d_20 0.2079 ms 36.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.7020431Z triton_convolution2d_22 0.4792 ms 15.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 
2025-09-07T10:57:28.7021609Z SingleProcess AUTOTUNE benchmarking takes 0.2389 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:57:28.8810165Z Autotune Choices Stats: 2025-09-07T10:57:28.8812070Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04198399931192398, "best_triton_pos": 1, "best_triton_time": 0.058368001133203506, "best_triton_kernel": "triton_convolution2d_30", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:57:28.8929383Z AUTOTUNE convolution(4x128x56x56, 256x128x3x3) 2025-09-07T10:57:28.8929786Z strides: [401408, 1, 7168, 128], [1152, 1, 384, 128] 2025-09-07T10:57:28.8930153Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:28.8930467Z convolution 0.0420 ms 100.0% 2025-09-07T10:57:28.8931359Z triton_convolution2d_30 0.0584 ms 71.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.8932865Z triton_convolution2d_32 0.0645 ms 65.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.8934346Z triton_convolution2d_33 0.0645 ms 65.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:28.8935823Z triton_convolution2d_31 0.0778 ms 53.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.8937466Z triton_convolution2d_28 0.0809 ms 51.9% 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.8938959Z triton_convolution2d_27 0.1116 ms 37.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:28.8940430Z triton_convolution2d_29 0.2365 ms 17.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:57:28.8941616Z SingleProcess AUTOTUNE benchmarking takes 0.1912 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:57:29.1104924Z Autotune Choices Stats: 2025-09-07T10:57:29.1106565Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.06860800087451935, "best_triton_pos": 1, "best_triton_time": 0.10547199845314026, "best_triton_kernel": "triton_convolution2d_37", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:57:29.1224711Z AUTOTUNE convolution(4x256x56x56, 256x256x3x3) 2025-09-07T10:57:29.1225147Z strides: [802816, 1, 14336, 256], [2304, 1, 768, 256] 2025-09-07T10:57:29.1225506Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:29.1225829Z convolution 0.0686 ms 100.0% 2025-09-07T10:57:29.1226741Z triton_convolution2d_37 0.1055 ms 65.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.1228255Z triton_convolution2d_39 0.1198 ms 57.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, 
KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.1229746Z triton_convolution2d_40 0.1249 ms 54.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.1231560Z triton_convolution2d_38 0.1372 ms 50.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.1233042Z triton_convolution2d_35 0.1485 ms 46.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.1234532Z triton_convolution2d_34 0.1628 ms 42.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.1236029Z triton_convolution2d_36 0.4608 ms 14.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:57:29.1237223Z SingleProcess AUTOTUNE benchmarking takes 0.2287 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:57:29.3285465Z Autotune Choices Stats: 2025-09-07T10:57:29.3287089Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04198399931192398, "best_triton_pos": 1, "best_triton_time": 0.07782399654388428, "best_triton_kernel": "triton_convolution2d_52", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 
2025-09-07T10:57:29.3406878Z AUTOTUNE convolution(4x256x28x28, 512x256x3x3) 2025-09-07T10:57:29.3407273Z strides: [200704, 1, 7168, 256], [2304, 1, 768, 256] 2025-09-07T10:57:29.3407626Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:29.3407944Z convolution 0.0420 ms 100.0% 2025-09-07T10:57:29.3408838Z triton_convolution2d_52 0.0778 ms 53.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.3410331Z triton_convolution2d_51 0.1004 ms 41.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.3411822Z triton_convolution2d_54 0.1085 ms 38.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.3413449Z triton_convolution2d_53 0.1096 ms 38.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.3414935Z triton_convolution2d_49 0.1280 ms 32.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.3416424Z triton_convolution2d_48 0.1434 ms 29.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.3417986Z triton_convolution2d_50 0.3983 ms 10.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, 
num_stages=1, num_warps=8 2025-09-07T10:57:29.3419166Z SingleProcess AUTOTUNE benchmarking takes 0.2139 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:57:29.6061421Z Autotune Choices Stats: 2025-09-07T10:57:29.6063267Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.06860800087451935, "best_triton_pos": 1, "best_triton_time": 0.14745600521564484, "best_triton_kernel": "triton_convolution2d_59", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:57:29.6180213Z AUTOTUNE convolution(4x512x28x28, 512x512x3x3) 2025-09-07T10:57:29.6180647Z strides: [401408, 1, 14336, 512], [4608, 1, 1536, 512] 2025-09-07T10:57:29.6181036Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:29.6181374Z convolution 0.0686 ms 100.0% 2025-09-07T10:57:29.6182251Z triton_convolution2d_59 0.1475 ms 46.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.6183765Z triton_convolution2d_58 0.1935 ms 35.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.6185283Z triton_convolution2d_60 0.2079 ms 33.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.6186780Z triton_convolution2d_61 0.2079 ms 33.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.6188270Z triton_convolution2d_56 
0.2396 ms 28.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.6191598Z triton_convolution2d_55 0.2826 ms 24.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.6193094Z triton_convolution2d_57 0.7547 ms 9.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:57:29.6194273Z SingleProcess AUTOTUNE benchmarking takes 0.2765 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:57:29.8827308Z Autotune Choices Stats: 2025-09-07T10:57:29.8828934Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.030719999223947525, "best_triton_pos": 1, "best_triton_time": 0.130048006772995, "best_triton_kernel": "triton_convolution2d_73", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:57:29.8953149Z AUTOTUNE convolution(4x512x14x14, 512x512x3x3) 2025-09-07T10:57:29.8953555Z strides: [100352, 1, 7168, 512], [4608, 1, 1536, 512] 2025-09-07T10:57:29.8953925Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:29.8954258Z convolution 0.0307 ms 100.0% 2025-09-07T10:57:29.8955129Z triton_convolution2d_73 0.1300 ms 23.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.8956629Z triton_convolution2d_72 0.1915 ms 16.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, 
KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.8958129Z triton_convolution2d_74 0.2028 ms 15.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.8959797Z triton_convolution2d_75 0.2068 ms 14.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:57:29.8961284Z triton_convolution2d_70 0.2406 ms 12.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.8962769Z triton_convolution2d_69 0.2673 ms 11.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:57:29.8964480Z triton_convolution2d_71 0.4198 ms 7.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:57:29.8965655Z SingleProcess AUTOTUNE benchmarking takes 0.2730 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:57:30.2852851Z Autotune Choices Stats: 2025-09-07T10:57:30.2854067Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_128", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.02252800017595291, "best_triton_pos": 0} 2025-09-07T10:57:30.2978851Z AUTOTUNE addmm(4x1000, 4x4096, 4096x1000) 2025-09-07T10:57:30.2979229Z strides: [0, 1], [4096, 
1], [1, 4096] 2025-09-07T10:57:30.2979602Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:57:30.2980413Z triton_mm_128 0.0225 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:57:30.2981868Z triton_mm_132 0.0225 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:30.2982632Z bias_addmm 0.0236 ms 95.7% 2025-09-07T10:57:30.2982924Z addmm 0.0266 ms 84.6% 2025-09-07T10:57:30.2983639Z triton_mm_136 0.0266 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:30.2984830Z triton_mm_140 0.0317 ms 71.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:57:30.2986105Z triton_mm_127 0.0358 ms 62.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:57:30.2987287Z triton_mm_126 0.0369 ms 61.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:30.2988479Z triton_mm_125 0.0379 ms 59.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:57:30.2989660Z triton_mm_131 0.0379 ms 59.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:30.2990686Z SingleProcess AUTOTUNE benchmarking takes 0.3946 seconds and 0.0002 seconds precompiling for 19 
choices 2025-09-07T10:57:37.5151362Z Autotune Choices Stats: 2025-09-07T10:57:37.5152560Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_224", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.1443839967250824, "best_triton_pos": 0} 2025-09-07T10:57:37.5400691Z AUTOTUNE mm(4096x4, 4x25088) 2025-09-07T10:57:37.5400987Z strides: [1, 4096], [25088, 1] 2025-09-07T10:57:37.5401278Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:37.5402405Z triton_mm_224 0.1444 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:57:37.5403839Z triton_mm_232 0.1475 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:37.5405036Z triton_mm_233 0.1475 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:57:37.5406246Z triton_mm_234 0.1475 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:37.5407436Z triton_mm_226 0.1485 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:57:37.5408609Z triton_mm_225 0.1495 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:57:37.5409810Z triton_mm_237 0.1495 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T10:57:37.5411026Z triton_mm_238 0.1495 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:37.5412333Z triton_mm_239 0.1495 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:57:37.5413531Z triton_mm_229 0.1526 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:57:37.5414559Z SingleProcess AUTOTUNE benchmarking takes 0.4893 seconds and 0.0004 seconds precompiling for 17 choices 2025-09-07T10:57:38.0869374Z Autotune Choices Stats: 2025-09-07T10:57:38.0870575Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_192", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.028672000393271446, "best_triton_pos": 0} 2025-09-07T10:57:38.1197879Z AUTOTUNE mm(4096x4, 4x4096) 2025-09-07T10:57:38.1198165Z strides: [1, 4096], [4096, 1] 2025-09-07T10:57:38.1198469Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:38.1199239Z triton_mm_192 0.0287 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:57:38.1200456Z triton_mm_193 0.0287 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:57:38.1201649Z triton_mm_195 0.0287 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=4 2025-09-07T10:57:38.1203012Z triton_mm_191 0.0297 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:57:38.1204213Z triton_mm_196 0.0297 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:57:38.1205400Z triton_mm_197 0.0297 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:57:38.1206812Z triton_mm_198 0.0297 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:38.1208009Z triton_mm_200 0.0297 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:57:38.1209201Z triton_mm_201 0.0297 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:38.1210397Z triton_mm_203 0.0297 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:57:38.1211419Z SingleProcess AUTOTUNE benchmarking takes 0.2772 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:57:38.5759297Z Autotune Choices Stats: 2025-09-07T10:57:38.5760514Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_158", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.01228800043463707, 
"best_triton_pos": 0} 2025-09-07T10:57:38.5934204Z AUTOTUNE mm(1000x4, 4x4096) 2025-09-07T10:57:38.5934513Z strides: [1, 1000], [4096, 1] 2025-09-07T10:57:38.5934824Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:38.5935591Z triton_mm_158 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:57:38.5936994Z triton_mm_159 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:57:38.5938290Z triton_mm_160 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:57:38.5939489Z triton_mm_162 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:38.5940685Z triton_mm_163 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:57:38.5941989Z triton_mm_164 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:57:38.5943171Z triton_mm_165 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:38.5944380Z triton_mm_166 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:38.5945594Z triton_mm_167 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:57:38.5946795Z triton_mm_168 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:38.5947838Z SingleProcess AUTOTUNE benchmarking takes 0.2345 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:57:39.0610199Z Autotune Choices Stats: 2025-09-07T10:57:39.0611667Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_143", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T10:57:39.0877359Z AUTOTUNE mm(4x1000, 1000x4096) 2025-09-07T10:57:39.0877684Z strides: [1000, 1], [4096, 1] 2025-09-07T10:57:39.0877989Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:39.0878747Z triton_mm_143 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:39.0879979Z triton_mm_149 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:39.0881201Z triton_mm_153 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:39.0882416Z triton_mm_157 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:57:39.0883386Z mm 0.0174 ms 94.1% 2025-09-07T10:57:39.0884085Z triton_mm_148 0.0174 ms 94.1% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:39.0885273Z triton_mm_152 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:39.0886609Z triton_mm_151 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:57:39.0887803Z triton_mm_155 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:39.0889004Z triton_mm_145 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:57:39.0890041Z SingleProcess AUTOTUNE benchmarking takes 0.2742 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:57:39.7009132Z Autotune Choices Stats: 2025-09-07T10:57:39.7010324Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_182", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.043007999658584595, "best_triton_pos": 0} 2025-09-07T10:57:39.7358693Z AUTOTUNE mm(4x4096, 4096x4096) 2025-09-07T10:57:39.7359017Z strides: [4096, 1], [4096, 1] 2025-09-07T10:57:39.7359320Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:39.7360089Z triton_mm_182 0.0430 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:39.7360853Z mm 0.0451 ms 95.5% 2025-09-07T10:57:39.7361559Z triton_mm_186 0.0451 ms 95.5% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:39.7362758Z triton_mm_190 0.0461 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:57:39.7364139Z triton_mm_176 0.0481 ms 89.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:57:39.7365333Z triton_mm_178 0.0492 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:57:39.7366747Z triton_mm_181 0.0502 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:39.7367943Z triton_mm_185 0.0532 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:39.7369126Z triton_mm_188 0.0532 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:39.7370306Z triton_mm_177 0.0543 ms 79.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:57:39.7371327Z SingleProcess AUTOTUNE benchmarking takes 0.4159 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:57:41.0024429Z Autotune Choices Stats: 2025-09-07T10:57:41.0026737Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_218", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, 
BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.16486400365829468, "best_triton_pos": 0} 2025-09-07T10:57:41.0276780Z AUTOTUNE mm(4x4096, 4096x25088) 2025-09-07T10:57:41.0277356Z strides: [4096, 1], [25088, 1] 2025-09-07T10:57:41.0277886Z dtypes: torch.float16, torch.float16 2025-09-07T10:57:41.0279371Z triton_mm_218 0.1649 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:41.0282031Z triton_mm_223 0.1649 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:57:41.0284646Z triton_mm_214 0.1659 ms 99.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:41.0286947Z triton_mm_220 0.1659 ms 99.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:41.0289265Z triton_mm_216 0.1669 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:57:41.0291716Z triton_mm_217 0.1669 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:57:41.0293986Z triton_mm_219 0.1669 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:41.0296302Z triton_mm_209 0.1679 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=4 2025-09-07T10:57:41.0298656Z triton_mm_221 0.1679 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:57:41.0300084Z mm 0.1690 ms 97.6% 2025-09-07T10:57:41.0301069Z SingleProcess AUTOTUNE benchmarking takes 0.5755 seconds and 0.0004 seconds precompiling for 18 choices 2025-09-07T10:57:46.0323697Z pass 2025-09-07T10:57:49.2348783Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:57:49.2350219Z import pynvml # type: ignore[import] 2025-09-07T10:57:51.9358593Z 2025-09-07T10:58:07.6045539Z loading model: 0it [00:00, ?it/s]skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:58:07.6046979Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:58:07.6047874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:58:07.6048566Z v4 = masked_index(y_high, x_high) 2025-09-07T10:58:07.6049208Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:58:07.6049838Z return input[ 2025-09-07T10:58:07.6049981Z 2025-09-07T10:58:07.6049986Z 2025-09-07T10:58:20.1960671Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:58:20.1962020Z val = 
_bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:58:20.1963157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:58:20.1963863Z v4 = masked_index(y_high, x_high) 2025-09-07T10:58:20.1964506Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:58:20.1965143Z return input[ 2025-09-07T10:58:20.1965603Z 2025-09-07T10:58:20.1965609Z 2025-09-07T10:58:29.1366207Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:58:29.1367547Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:58:29.1368474Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:58:29.1369175Z v4 = masked_index(y_high, x_high) 2025-09-07T10:58:29.1369798Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:58:29.1370428Z return input[ 2025-09-07T10:58:29.1370585Z 2025-09-07T10:58:29.1370953Z 2025-09-07T10:58:37.6483478Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:58:37.6484837Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:58:37.6485740Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:58:37.6486447Z v4 = masked_index(y_high, x_high) 2025-09-07T10:58:37.6487089Z 
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:58:37.6487728Z return input[ 2025-09-07T10:58:37.6487889Z 2025-09-07T10:58:37.6487894Z 2025-09-07T10:58:48.1976017Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:58:48.1977350Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:58:48.1978312Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:58:48.1979010Z v4 = masked_index(y_high, x_high) 2025-09-07T10:58:48.1979633Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:58:48.1984232Z return input[ 2025-09-07T10:58:48.1984596Z 2025-09-07T10:58:48.1984602Z 2025-09-07T10:58:58.3275064Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:58:58.3276380Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:58:58.3277300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:58:58.3278001Z v4 = masked_index(y_high, x_high) 2025-09-07T10:58:58.3278641Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:58:58.3279269Z return input[ 2025-09-07T10:58:58.3279413Z 2025-09-07T10:58:58.3279418Z 2025-09-07T10:59:11.5931657Z skipping cudagraphs due to disabling cudagraphs due to incompatible op 
aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:59:11.5933608Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:59:11.5934518Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:59:11.5935218Z v4 = masked_index(y_high, x_high) 2025-09-07T10:59:11.5936152Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:59:11.5936806Z return input[ 2025-09-07T10:59:11.5936950Z 2025-09-07T10:59:11.5936955Z 2025-09-07T10:59:16.1119974Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T10:59:16.1121313Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T10:59:16.1122201Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T10:59:16.1123084Z v4 = masked_index(y_high, x_high) 2025-09-07T10:59:16.1123727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T10:59:16.1124632Z return input[ 2025-09-07T10:59:16.1124784Z 2025-09-07T10:59:16.1124789Z 2025-09-07T10:59:17.7019300Z 2025-09-07T10:59:17.7019715Z loading model: 0it [01:25, ?it/s] 2025-09-07T10:59:17.7020131Z cuda train vision_maskrcnn 2025-09-07T10:59:17.8546744Z W0907 10:59:17.854000 127386 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] torch._dynamo hit config.recompile_limit (8) 2025-09-07T10:59:17.8548058Z W0907 10:59:17.854000 127386 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] function: 
'_roi_align' (/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py:114) 2025-09-07T10:59:17.8549410Z W0907 10:59:17.854000 127386 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] last reason: 0/7: tensor 'rois' dtype mismatch. expected Float, actual Double 2025-09-07T10:59:17.8550579Z W0907 10:59:17.854000 127386 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". 2025-09-07T10:59:17.8551915Z W0907 10:59:17.854000 127386 site-packages/torch/_dynamo/convert_frame.py:1358] [0/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html 2025-09-07T10:59:39.2856509Z Autotune Choices Stats: 2025-09-07T10:59:39.2858241Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_57", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.03686400130391121, "best_triton_pos": 0} 2025-09-07T10:59:39.2991722Z AUTOTUNE mm(60800x64, 64x256) 2025-09-07T10:59:39.2992246Z strides: [64, 1], [1, 64] 2025-09-07T10:59:39.2992616Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:39.2995581Z triton_mm_57 0.0369 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:39.2996930Z triton_mm_58 0.0369 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:59:39.2998252Z triton_mm_61 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:39.2999556Z triton_mm_62 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:59:39.3000857Z triton_mm_63 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:39.3002124Z triton_mm_64 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:39.3003751Z triton_mm_66 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:39.3005301Z triton_mm_68 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:59:39.3006582Z triton_mm_59 0.0389 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:39.3007866Z triton_mm_67 0.0389 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:39.3008977Z SingleProcess AUTOTUNE benchmarking takes 0.4283 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T10:59:42.4228223Z Autotune Choices Stats: 2025-09-07T10:59:42.4229928Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.2734079957008362, "best_triton_pos": 1, "best_triton_time": 0.5283839702606201, "best_triton_kernel": "triton_convolution2d_833", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 
2025-09-07T10:59:42.4354795Z AUTOTUNE convolution(1x256x200x304, 256x256x3x3) 2025-09-07T10:59:42.4355229Z strides: [15564800, 1, 77824, 256], [2304, 1, 768, 256] 2025-09-07T10:59:42.4355608Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:42.4355930Z convolution 0.2734 ms 100.0% 2025-09-07T10:59:42.4356824Z triton_convolution2d_833 0.5284 ms 51.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:42.4358340Z triton_convolution2d_835 0.5734 ms 47.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:42.4359835Z triton_convolution2d_834 0.5960 ms 45.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:42.4361667Z triton_convolution2d_836 0.6298 ms 43.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:42.4363361Z triton_convolution2d_830 0.7506 ms 36.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:42.4364862Z triton_convolution2d_831 0.7547 ms 36.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:42.4366368Z triton_convolution2d_832 2.5692 ms 10.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, 
UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:59:42.4367550Z SingleProcess AUTOTUNE benchmarking takes 0.3267 seconds and 1.7310 seconds precompiling for 8 choices 2025-09-07T10:59:44.1304221Z Autotune Choices Stats: 2025-09-07T10:59:44.1305453Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_828", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.0727040022611618, "best_triton_pos": 0} 2025-09-07T10:59:44.1432252Z AUTOTUNE addmm(60800x256, 60800x256, 256x256) 2025-09-07T10:59:44.1432900Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T10:59:44.1433258Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:59:44.1434066Z triton_mm_828 0.0727 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:44.1435293Z triton_mm_827 0.0737 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:44.1436482Z triton_mm_824 0.0748 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:44.1437666Z triton_mm_821 0.0758 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:44.1438964Z triton_mm_825 0.0758 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:44.1439721Z bias_addmm 0.0768 ms 94.7% 2025-09-07T10:59:44.1440442Z triton_mm_822 0.0768 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:44.1441629Z triton_mm_823 0.0778 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:44.1443020Z triton_mm_829 0.0840 ms 86.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:59:44.1444213Z triton_mm_819 0.0901 ms 80.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:44.1445246Z SingleProcess AUTOTUNE benchmarking takes 0.4891 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:59:44.8583019Z Autotune Choices Stats: 2025-09-07T10:59:44.8584650Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_168", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.050175998359918594, "best_triton_pos": 0} 2025-09-07T10:59:44.8707858Z AUTOTUNE mm(60800x256, 256x128) 2025-09-07T10:59:44.8708246Z strides: [256, 1], [1, 256] 2025-09-07T10:59:44.8708546Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:44.8709337Z triton_mm_168 0.0502 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:44.8710558Z triton_mm_173 0.0502 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:44.8711320Z mm 0.0512 ms 98.0% 2025-09-07T10:59:44.8712019Z triton_mm_166 0.0512 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:44.8713222Z triton_mm_169 0.0512 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:44.8714408Z triton_mm_167 0.0522 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:44.8715593Z triton_mm_170 0.0522 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:44.8716951Z triton_mm_172 0.0522 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:44.8718163Z triton_mm_174 0.0522 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:59:44.8719342Z triton_mm_164 0.0532 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:44.8720370Z SingleProcess AUTOTUNE benchmarking takes 0.3572 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:59:45.4601018Z Autotune Choices Stats: 2025-09-07T10:59:45.4602535Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_241", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.025599999353289604, "best_triton_pos": 0} 2025-09-07T10:59:45.4725348Z AUTOTUNE mm(15200x128, 128x512) 2025-09-07T10:59:45.4725644Z strides: [128, 1], [1, 
128] 2025-09-07T10:59:45.4725945Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:45.4726733Z triton_mm_241 0.0256 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:45.4727936Z triton_mm_243 0.0256 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:45.4729132Z triton_mm_244 0.0256 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:45.4730327Z triton_mm_248 0.0256 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:45.4731537Z triton_mm_242 0.0266 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:45.4733010Z triton_mm_245 0.0266 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:45.4734206Z triton_mm_238 0.0276 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:59:45.4735401Z triton_mm_247 0.0276 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:45.4736163Z mm 0.0287 ms 89.3% 2025-09-07T10:59:45.4736847Z triton_mm_246 0.0297 ms 86.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, 
num_warps=8 2025-09-07T10:59:45.4737980Z SingleProcess AUTOTUNE benchmarking takes 0.3026 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:59:47.0983533Z Autotune Choices Stats: 2025-09-07T10:59:47.0984761Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_14", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.01740800030529499, "best_triton_pos": 0} 2025-09-07T10:59:47.1116870Z AUTOTUNE mm(60800x64, 64x64) 2025-09-07T10:59:47.1117173Z strides: [64, 1], [1, 64] 2025-09-07T10:59:47.1117472Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:47.1118235Z triton_mm_14 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:59:47.1119712Z triton_mm_17 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:47.1120901Z triton_mm_18 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:59:47.1122094Z triton_mm_23 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:59:47.1123483Z triton_mm_13 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:47.1124789Z triton_mm_22 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:47.1125970Z triton_mm_7 
0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:59:47.1127148Z triton_mm_10 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:59:47.1128311Z triton_mm_12 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:59:47.1129479Z triton_mm_15 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:47.1130497Z SingleProcess AUTOTUNE benchmarking takes 0.2810 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:59:47.7657620Z Autotune Choices Stats: 2025-09-07T10:59:47.7659237Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_85", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.043007999658584595, "best_triton_pos": 0} 2025-09-07T10:59:47.7784354Z AUTOTUNE mm(60800x256, 256x64) 2025-09-07T10:59:47.7784652Z strides: [256, 1], [1, 256] 2025-09-07T10:59:47.7784946Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:47.7785717Z triton_mm_85 0.0430 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:47.7786911Z triton_mm_76 0.0440 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:47.7788083Z triton_mm_80 0.0440 ms 97.7% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:47.7789256Z triton_mm_78 0.0451 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:47.7790429Z triton_mm_86 0.0451 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:59:47.7791613Z triton_mm_79 0.0461 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:47.7792797Z triton_mm_82 0.0461 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:47.7794150Z triton_mm_83 0.0461 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:47.7794894Z mm 0.0471 ms 91.3% 2025-09-07T10:59:47.7795567Z triton_mm_75 0.0471 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:59:47.7796593Z SingleProcess AUTOTUNE benchmarking takes 0.3249 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:59:48.1622132Z Autotune Choices Stats: 2025-09-07T10:59:48.1623361Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_347", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.03379200026392937, "best_triton_pos": 0} 2025-09-07T10:59:48.1748122Z AUTOTUNE mm(15200x512, 
512x256) 2025-09-07T10:59:48.1748488Z strides: [512, 1], [1, 512] 2025-09-07T10:59:48.1748788Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:48.1749578Z triton_mm_347 0.0338 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:48.1750809Z triton_mm_352 0.0369 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:48.1751558Z mm 0.0389 ms 86.8% 2025-09-07T10:59:48.1752238Z triton_mm_346 0.0399 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:48.1753429Z triton_mm_353 0.0399 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:59:48.1754624Z triton_mm_343 0.0410 ms 82.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:48.1758762Z triton_mm_345 0.0410 ms 82.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:48.1760021Z triton_mm_348 0.0410 ms 82.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:48.1761211Z triton_mm_349 0.0410 ms 82.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:48.1762392Z triton_mm_351 0.0440 ms 76.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:48.1763675Z SingleProcess AUTOTUNE benchmarking takes 0.3321 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:59:48.7554464Z Autotune Choices Stats: 2025-09-07T10:59:48.7556016Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.021503999829292297, "best_triton_pos": 1, "best_triton_time": 0.021503999829292297, "best_triton_kernel": "triton_mm_423", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:59:48.7681868Z AUTOTUNE mm(3800x256, 256x1024) 2025-09-07T10:59:48.7682181Z strides: [256, 1], [1, 256] 2025-09-07T10:59:48.7682474Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:48.7682786Z mm 0.0215 ms 100.0% 2025-09-07T10:59:48.7683969Z triton_mm_423 0.0215 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:48.7685180Z triton_mm_420 0.0225 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:48.7686380Z triton_mm_422 0.0225 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:48.7687576Z triton_mm_426 0.0225 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:48.7688757Z triton_mm_427 0.0225 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:48.7690057Z triton_mm_421 
0.0236 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:48.7691249Z triton_mm_424 0.0246 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:48.7692443Z triton_mm_417 0.0256 ms 84.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:59:48.7693620Z triton_mm_418 0.0266 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:48.7694651Z SingleProcess AUTOTUNE benchmarking takes 0.2992 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:59:50.0676116Z Autotune Choices Stats: 2025-09-07T10:59:50.0677859Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.09216000139713287, "best_triton_pos": 1, "best_triton_time": 0.16383999586105347, "best_triton_kernel": "triton_convolution2d_809", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:59:50.0813420Z AUTOTUNE convolution(1x256x100x152, 256x256x3x3) 2025-09-07T10:59:50.0813980Z strides: [3891200, 1, 38912, 256], [2304, 1, 768, 256] 2025-09-07T10:59:50.0814343Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:50.0814668Z convolution 0.0922 ms 100.0% 2025-09-07T10:59:50.0815554Z triton_convolution2d_809 0.1638 ms 56.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:50.0817064Z 
triton_convolution2d_808 0.1894 ms 48.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:50.0818681Z triton_convolution2d_811 0.2038 ms 45.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:50.0820183Z triton_convolution2d_810 0.2058 ms 44.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:50.0821660Z triton_convolution2d_806 0.2570 ms 35.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:50.0823153Z triton_convolution2d_805 0.2703 ms 34.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:50.0824800Z triton_convolution2d_807 0.6666 ms 13.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:59:50.0825986Z SingleProcess AUTOTUNE benchmarking takes 0.2828 seconds and 0.0003 seconds precompiling for 8 choices 2025-09-07T10:59:51.6022787Z Autotune Choices Stats: 2025-09-07T10:59:51.6024332Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "bias_addmm", "best_time": 0.03891199827194214, "best_triton_pos": 1, "best_triton_time": 0.03891199827194214, "best_triton_kernel": "triton_mm_798", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:59:51.6153645Z AUTOTUNE addmm(15200x256, 15200x512, 512x256) 2025-09-07T10:59:51.6154010Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T10:59:51.6154371Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:59:51.6154731Z bias_addmm 0.0389 ms 100.0% 2025-09-07T10:59:51.6155489Z triton_mm_798 0.0389 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:51.6156699Z triton_mm_803 0.0389 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:51.6157906Z triton_mm_796 0.0410 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:51.6159094Z triton_mm_797 0.0410 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:51.6160285Z triton_mm_799 0.0410 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:51.6161696Z triton_mm_794 0.0430 ms 90.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:51.6163094Z triton_mm_800 0.0430 ms 90.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:51.6164305Z triton_mm_804 0.0430 ms 90.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:59:51.6165512Z triton_mm_802 0.0451 ms 86.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:51.6174905Z SingleProcess AUTOTUNE benchmarking takes 0.3689 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:59:52.2419993Z Autotune Choices Stats: 2025-09-07T10:59:52.2421203Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_218", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.030719999223947525, "best_triton_pos": 0} 2025-09-07T10:59:52.2549899Z AUTOTUNE mm(15200x512, 512x128) 2025-09-07T10:59:52.2550218Z strides: [512, 1], [1, 512] 2025-09-07T10:59:52.2550516Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:52.2551300Z triton_mm_218 0.0307 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:52.2552710Z triton_mm_223 0.0307 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:52.2553927Z triton_mm_214 0.0317 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:52.2555112Z triton_mm_219 0.0317 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:52.2556298Z triton_mm_215 0.0338 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=4 2025-09-07T10:59:52.2557580Z triton_mm_216 0.0338 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:52.2558777Z triton_mm_222 0.0338 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:52.2559983Z triton_mm_224 0.0338 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:59:52.2561158Z triton_mm_213 0.0348 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:59:52.2562341Z triton_mm_217 0.0348 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:52.2563630Z SingleProcess AUTOTUNE benchmarking takes 0.3037 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:59:52.9385410Z Autotune Choices Stats: 2025-09-07T10:59:52.9387354Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.030719999223947525, "best_triton_pos": 1, "best_triton_time": 0.03481600061058998, "best_triton_kernel": "triton_mm_613", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:59:52.9515342Z AUTOTUNE mm(3800x1024, 1024x512) 2025-09-07T10:59:52.9515654Z strides: [1024, 1], [1, 1024] 2025-09-07T10:59:52.9515959Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:52.9516271Z mm 0.0307 ms 100.0% 2025-09-07T10:59:52.9516993Z triton_mm_613 0.0348 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:52.9518193Z triton_mm_612 0.0358 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:52.9519388Z triton_mm_610 0.0369 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:52.9520579Z triton_mm_617 0.0389 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:52.9521763Z triton_mm_607 0.0399 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:59:52.9523128Z triton_mm_608 0.0399 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:52.9524480Z triton_mm_611 0.0399 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:52.9525671Z triton_mm_614 0.0399 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:52.9526874Z triton_mm_616 0.0410 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:52.9527909Z SingleProcess AUTOTUNE benchmarking takes 0.3356 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:59:53.5296960Z Autotune Choices Stats: 2025-09-07T10:59:53.5298544Z 
{"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.021503999829292297, "best_triton_pos": 1, "best_triton_time": 0.021503999829292297, "best_triton_kernel": "triton_mm_685", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:59:53.5427100Z AUTOTUNE mm(950x512, 512x2048) 2025-09-07T10:59:53.5427390Z strides: [512, 1], [1, 512] 2025-09-07T10:59:53.5427692Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:53.5427999Z mm 0.0215 ms 100.0% 2025-09-07T10:59:53.5428732Z triton_mm_685 0.0215 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:53.5429937Z triton_mm_687 0.0215 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:53.5431122Z triton_mm_688 0.0215 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:53.5432313Z triton_mm_692 0.0236 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:53.5433505Z triton_mm_682 0.0246 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:59:53.5434953Z triton_mm_683 0.0246 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:53.5436149Z triton_mm_686 0.0246 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:53.5437338Z triton_mm_689 0.0246 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:53.5438534Z triton_mm_691 0.0246 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:53.5439553Z SingleProcess AUTOTUNE benchmarking takes 0.2994 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:59:54.8172032Z Autotune Choices Stats: 2025-09-07T10:59:54.8173583Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.023552000522613525, "best_triton_pos": 1, "best_triton_time": 0.025599999353289604, "best_triton_kernel": "triton_mm_397", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:59:54.8304951Z AUTOTUNE mm(3800x1024, 1024x256) 2025-09-07T10:59:54.8305270Z strides: [1024, 1], [1, 1024] 2025-09-07T10:59:54.8305888Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:54.8306185Z mm 0.0236 ms 100.0% 2025-09-07T10:59:54.8306899Z triton_mm_397 0.0256 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:54.8308108Z triton_mm_403 0.0256 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:59:54.8309305Z triton_mm_393 0.0266 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:54.8310490Z 
triton_mm_402 0.0266 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:54.8311804Z triton_mm_398 0.0287 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:54.8312997Z triton_mm_394 0.0297 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:59:54.8314174Z triton_mm_396 0.0297 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:54.8315368Z triton_mm_399 0.0297 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:54.8316558Z triton_mm_395 0.0307 ms 76.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:54.8317588Z SingleProcess AUTOTUNE benchmarking takes 0.3079 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:59:55.4622017Z Autotune Choices Stats: 2025-09-07T10:59:55.4624203Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03891199827194214, "best_triton_pos": 1, "best_triton_time": 0.06758400052785873, "best_triton_kernel": "triton_convolution2d_784", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:59:55.4755211Z AUTOTUNE convolution(1x256x50x76, 256x256x3x3) 2025-09-07T10:59:55.4755609Z strides: [972800, 1, 
19456, 256], [2304, 1, 768, 256] 2025-09-07T10:59:55.4755976Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:55.4756297Z convolution 0.0389 ms 100.0% 2025-09-07T10:59:55.4757177Z triton_convolution2d_784 0.0676 ms 57.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:55.4758684Z triton_convolution2d_783 0.1014 ms 38.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:55.4760191Z triton_convolution2d_786 0.1075 ms 36.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:55.4761694Z triton_convolution2d_785 0.1085 ms 35.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:55.4763407Z triton_convolution2d_781 0.1331 ms 29.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:55.4768133Z triton_convolution2d_780 0.1413 ms 27.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:55.4769651Z triton_convolution2d_782 0.2294 ms 17.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:59:55.4770833Z SingleProcess AUTOTUNE benchmarking takes 0.2112 seconds and 0.0002 
seconds precompiling for 8 choices 2025-09-07T10:59:56.6001554Z Autotune Choices Stats: 2025-09-07T10:59:56.6003281Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "bias_addmm", "best_time": 0.023552000522613525, "best_triton_pos": 1, "best_triton_time": 0.026623999699950218, "best_triton_kernel": "triton_mm_773", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:59:56.6130847Z AUTOTUNE addmm(3800x256, 3800x1024, 1024x256) 2025-09-07T10:59:56.6131246Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T10:59:56.6131593Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:59:56.6131963Z bias_addmm 0.0236 ms 100.0% 2025-09-07T10:59:56.6132734Z triton_mm_773 0.0266 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:56.6133937Z triton_mm_779 0.0266 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:59:56.6135135Z triton_mm_769 0.0276 ms 85.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:59:56.6136312Z triton_mm_778 0.0276 ms 85.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:56.6137066Z addmm 0.0297 ms 79.3% 2025-09-07T10:59:56.6138207Z triton_mm_774 0.0297 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:56.6139414Z triton_mm_775 0.0297 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:56.6140600Z triton_mm_770 0.0307 ms 76.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:59:56.6141792Z triton_mm_772 0.0307 ms 76.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:56.6142817Z SingleProcess AUTOTUNE benchmarking takes 0.3402 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:59:57.2635017Z Autotune Choices Stats: 2025-09-07T10:59:57.2636565Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.023552000522613525, "best_triton_pos": 1, "best_triton_time": 0.02969600073993206, "best_triton_kernel": "triton_mm_659", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:59:57.2771895Z AUTOTUNE mm(950x2048, 2048x512) 2025-09-07T10:59:57.2772216Z strides: [2048, 1], [1, 2048] 2025-09-07T10:59:57.2772587Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:57.2772888Z mm 0.0236 ms 100.0% 2025-09-07T10:59:57.2773888Z triton_mm_659 0.0297 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:59:57.2775090Z triton_mm_662 0.0297 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:57.2776285Z triton_mm_658 0.0317 ms 74.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 
2025-09-07T10:59:57.2777473Z triton_mm_661 0.0328 ms 71.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:57.2778716Z triton_mm_655 0.0338 ms 69.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:59:57.2779993Z triton_mm_664 0.0338 ms 69.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:59:57.2781188Z triton_mm_652 0.0369 ms 63.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:59:57.2782385Z triton_mm_667 0.0420 ms 56.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:57.2783580Z triton_mm_660 0.0430 ms 54.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:59:57.2784609Z SingleProcess AUTOTUNE benchmarking takes 0.3281 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:59:57.8153562Z Autotune Choices Stats: 2025-09-07T10:59:57.8155300Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.020479999482631683, "best_triton_pos": 1, "best_triton_time": 0.0655359998345375, "best_triton_kernel": "triton_convolution2d_759", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:59:57.8287603Z AUTOTUNE convolution(1x256x25x38, 256x256x3x3) 
2025-09-07T10:59:57.8288006Z strides: [243200, 1, 9728, 256], [2304, 1, 768, 256] 2025-09-07T10:59:57.8288377Z dtypes: torch.float16, torch.float16 2025-09-07T10:59:57.8288689Z convolution 0.0205 ms 100.0% 2025-09-07T10:59:57.8289578Z triton_convolution2d_759 0.0655 ms 31.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:57.8291099Z triton_convolution2d_758 0.1014 ms 20.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:57.8292605Z triton_convolution2d_761 0.1014 ms 20.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:57.8294104Z triton_convolution2d_760 0.1055 ms 19.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:59:57.8295597Z triton_convolution2d_756 0.1311 ms 15.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:57.8297095Z triton_convolution2d_755 0.1362 ms 15.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:59:57.8298772Z triton_convolution2d_757 0.2304 ms 8.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:59:57.8299958Z SingleProcess 
AUTOTUNE benchmarking takes 0.2197 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:00:00.2314324Z Autotune Choices Stats: 2025-09-07T11:00:00.2316012Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.10956799983978271, "best_triton_pos": 1, "best_triton_time": 0.43724799156188965, "best_triton_kernel": "triton_convolution2d_0", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:00:00.2457632Z AUTOTUNE convolution(1x3x800x1216, 64x3x7x7) 2025-09-07T11:00:00.2457995Z strides: [2918400, 1, 3648, 3], [147, 1, 21, 3] 2025-09-07T11:00:00.2458350Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:00.2458666Z convolution 0.1096 ms 100.0% 2025-09-07T11:00:00.2459561Z triton_convolution2d_0 0.4372 ms 25.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:00.2461040Z triton_convolution2d_3 0.4536 ms 24.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:00.2462518Z triton_convolution2d_4 0.5171 ms 21.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:00.2463989Z triton_convolution2d_2 0.5581 ms 19.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:00:00.2465718Z triton_convolution2d_5 0.5806 ms 18.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, 
KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:00.2467200Z triton_convolution2d_1 0.7455 ms 14.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:00.2468368Z SingleProcess AUTOTUNE benchmarking takes 0.3063 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T11:00:00.4113266Z Autotune Choices Stats: 2025-09-07T11:00:00.4114919Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04608000069856644, "best_triton_pos": 1, "best_triton_time": 0.05734400078654289, "best_triton_kernel": "triton_convolution2d_28", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:00:00.4243978Z AUTOTUNE convolution(1x64x200x304, 64x64x3x3) 2025-09-07T11:00:00.4244427Z strides: [3891200, 1, 19456, 64], [576, 1, 192, 64] 2025-09-07T11:00:00.4244776Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:00.4245101Z convolution 0.0461 ms 100.0% 2025-09-07T11:00:00.4245987Z triton_convolution2d_28 0.0573 ms 80.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:00.4247488Z triton_convolution2d_27 0.0604 ms 76.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:00.4249169Z triton_convolution2d_29 0.0614 ms 75.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, 
num_stages=2, num_warps=8 2025-09-07T11:00:00.4250651Z triton_convolution2d_30 0.0635 ms 72.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:00.4252121Z triton_convolution2d_25 0.0768 ms 60.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:00.4253671Z triton_convolution2d_24 0.1044 ms 44.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:00.4255159Z triton_convolution2d_26 0.1731 ms 26.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:00:00.4256342Z SingleProcess AUTOTUNE benchmarking takes 0.1775 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:00:01.1776696Z Autotune Choices Stats: 2025-09-07T11:00:01.1778413Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04915200173854828, "best_triton_pos": 1, "best_triton_time": 0.05632000043988228, "best_triton_kernel": "triton_convolution2d_179", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:00:01.1917271Z AUTOTUNE convolution(1x128x200x304, 128x128x3x3) 2025-09-07T11:00:01.1917679Z strides: [7782400, 1, 38912, 128], [1152, 1, 384, 128] 2025-09-07T11:00:01.1918041Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:01.1918365Z convolution 0.0492 ms 100.0% 2025-09-07T11:00:01.1919522Z 
triton_convolution2d_179 0.0563 ms 87.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:01.1921033Z triton_convolution2d_178 0.0655 ms 75.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:01.1922537Z triton_convolution2d_181 0.0666 ms 73.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:01.1924215Z triton_convolution2d_176 0.0809 ms 60.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:01.1925694Z triton_convolution2d_180 0.0809 ms 60.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:01.1927196Z triton_convolution2d_175 0.0829 ms 59.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:01.1928697Z triton_convolution2d_177 0.2447 ms 20.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:00:01.1929978Z SingleProcess AUTOTUNE benchmarking takes 0.7547 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:00:01.3437157Z Autotune Choices Stats: 2025-09-07T11:00:01.3438917Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": 
"convolution", "best_time": 0.03788800165057182, "best_triton_pos": 1, "best_triton_time": 0.048128001391887665, "best_triton_kernel": "triton_convolution2d_203", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T11:00:01.3573757Z AUTOTUNE convolution(1x256x200x304, 512x256x1x1) 2025-09-07T11:00:01.3574198Z strides: [15564800, 1, 77824, 256], [256, 1, 256, 256] 2025-09-07T11:00:01.3574873Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:01.3575185Z convolution 0.0379 ms 100.0% 2025-09-07T11:00:01.3576089Z triton_convolution2d_203 0.0481 ms 78.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T11:00:01.3577665Z triton_convolution2d_205 0.0481 ms 78.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T11:00:01.3579171Z triton_convolution2d_206 0.0502 ms 75.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T11:00:01.3580665Z triton_convolution2d_204 0.0532 ms 71.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T11:00:01.3582155Z triton_convolution2d_200 0.0584 ms 64.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T11:00:01.3583639Z triton_convolution2d_201 0.0635 ms 59.7% ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T11:00:01.3585314Z triton_convolution2d_202 0.1526 ms 24.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T11:00:01.3586497Z SingleProcess AUTOTUNE benchmarking takes 0.1646 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:00:01.5258409Z Autotune Choices Stats: 2025-09-07T11:00:01.5260053Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04198399931192398, "best_triton_pos": 1, "best_triton_time": 0.05119999870657921, "best_triton_kernel": "triton_convolution2d_229", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:00:01.5392547Z AUTOTUNE convolution(1x128x100x152, 128x128x3x3) 2025-09-07T11:00:01.5392943Z strides: [1945600, 1, 19456, 128], [1152, 1, 384, 128] 2025-09-07T11:00:01.5393321Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:01.5393642Z convolution 0.0420 ms 100.0% 2025-09-07T11:00:01.5394530Z triton_convolution2d_229 0.0512 ms 82.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:01.5396030Z triton_convolution2d_228 0.0594 ms 70.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:01.5397689Z triton_convolution2d_231 0.0614 ms 68.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, 
PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:01.5399182Z triton_convolution2d_230 0.0686 ms 61.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:01.5400682Z triton_convolution2d_226 0.0758 ms 55.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:01.5402184Z triton_convolution2d_225 0.0819 ms 51.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:01.5403945Z triton_convolution2d_227 0.2263 ms 18.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:00:01.5405121Z SingleProcess AUTOTUNE benchmarking takes 0.1809 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:00:01.7472888Z Autotune Choices Stats: 2025-09-07T11:00:01.7474573Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04095999896526337, "best_triton_pos": 1, "best_triton_time": 0.07168000191450119, "best_triton_kernel": "triton_convolution2d_358", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:00:01.7610551Z AUTOTUNE convolution(1x256x100x152, 256x256x3x3) 2025-09-07T11:00:01.7610979Z strides: [3891200, 1, 38912, 256], [2304, 1, 768, 256] 2025-09-07T11:00:01.7611356Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:01.7611668Z convolution 0.0410 ms 
100.0% 2025-09-07T11:00:01.7612557Z triton_convolution2d_358 0.0717 ms 57.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:01.7614398Z triton_convolution2d_357 0.1044 ms 39.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:01.7615914Z triton_convolution2d_360 0.1096 ms 37.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:01.7617418Z triton_convolution2d_359 0.1116 ms 36.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:01.7618978Z triton_convolution2d_355 0.1403 ms 29.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:01.7620473Z triton_convolution2d_354 0.1413 ms 29.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:01.7621980Z triton_convolution2d_356 0.2468 ms 16.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:00:01.7623153Z SingleProcess AUTOTUNE benchmarking takes 0.2125 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:00:01.9135092Z Autotune Choices Stats: 2025-09-07T11:00:01.9136732Z {"num_choices": 8, 
"num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03379200026392937, "best_triton_pos": 1, "best_triton_time": 0.04198399931192398, "best_triton_kernel": "triton_convolution2d_383", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T11:00:01.9271868Z AUTOTUNE convolution(1x512x100x152, 1024x512x1x1) 2025-09-07T11:00:01.9272301Z strides: [7782400, 1, 77824, 512], [512, 1, 512, 512] 2025-09-07T11:00:01.9272668Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:01.9272977Z convolution 0.0338 ms 100.0% 2025-09-07T11:00:01.9273857Z triton_convolution2d_383 0.0420 ms 80.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T11:00:01.9275584Z triton_convolution2d_382 0.0461 ms 73.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T11:00:01.9277094Z triton_convolution2d_384 0.0461 ms 73.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T11:00:01.9278580Z triton_convolution2d_385 0.0532 ms 63.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T11:00:01.9280061Z triton_convolution2d_379 0.0594 ms 56.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T11:00:01.9281560Z triton_convolution2d_380 0.0696 ms 
48.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T11:00:01.9283388Z triton_convolution2d_381 0.1618 ms 20.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T11:00:01.9284558Z SingleProcess AUTOTUNE benchmarking takes 0.1648 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:00:02.2095512Z Autotune Choices Stats: 2025-09-07T11:00:02.2097178Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04505600035190582, "best_triton_pos": 1, "best_triton_time": 0.13209599256515503, "best_triton_kernel": "triton_convolution2d_623", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:00:02.2237186Z AUTOTUNE convolution(1x512x50x76, 512x512x3x3) 2025-09-07T11:00:02.2237623Z strides: [1945600, 1, 38912, 512], [4608, 1, 1536, 512] 2025-09-07T11:00:02.2237998Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:02.2238317Z convolution 0.0451 ms 100.0% 2025-09-07T11:00:02.2239199Z triton_convolution2d_623 0.1321 ms 34.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:02.2240702Z triton_convolution2d_622 0.1925 ms 23.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:02.2242203Z triton_convolution2d_625 0.1987 ms 22.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, 
KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:02.2244127Z triton_convolution2d_624 0.2068 ms 21.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:02.2245638Z triton_convolution2d_620 0.2662 ms 16.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:02.2247140Z triton_convolution2d_619 0.2744 ms 16.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:02.2248732Z triton_convolution2d_621 0.4659 ms 9.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:00:02.2249920Z SingleProcess AUTOTUNE benchmarking takes 0.2712 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:00:02.3827896Z Autotune Choices Stats: 2025-09-07T11:00:02.3829543Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03481600061058998, "best_triton_pos": 1, "best_triton_time": 0.04095999896526337, "best_triton_kernel": "triton_convolution2d_648", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T11:00:02.3964092Z AUTOTUNE convolution(1x1024x50x76, 2048x1024x1x1) 2025-09-07T11:00:02.3964558Z strides: [3891200, 1, 77824, 1024], [1024, 1, 1024, 1024] 2025-09-07T11:00:02.3964943Z dtypes: torch.float16, torch.float16 
2025-09-07T11:00:02.3965272Z convolution 0.0348 ms 100.0% 2025-09-07T11:00:02.3966142Z triton_convolution2d_648 0.0410 ms 85.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T11:00:02.3967947Z triton_convolution2d_647 0.0492 ms 70.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T11:00:02.3969449Z triton_convolution2d_649 0.0492 ms 70.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T11:00:02.3970941Z triton_convolution2d_650 0.0522 ms 66.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T11:00:02.3972442Z triton_convolution2d_644 0.0604 ms 57.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T11:00:02.3973924Z triton_convolution2d_645 0.0666 ms 52.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T11:00:02.3975429Z triton_convolution2d_646 0.2150 ms 16.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T11:00:02.3976617Z SingleProcess AUTOTUNE benchmarking takes 0.1713 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:00:02.6530289Z Autotune Choices Stats: 
2025-09-07T11:00:02.6531948Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.048128001391887665, "best_triton_pos": 1, "best_triton_time": 0.12800000607967377, "best_triton_kernel": "triton_convolution2d_673", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:00:02.6675319Z AUTOTUNE convolution(1x512x25x38, 512x512x3x3) 2025-09-07T11:00:02.6675721Z strides: [486400, 1, 19456, 512], [4608, 1, 1536, 512] 2025-09-07T11:00:02.6676080Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:02.6676403Z convolution 0.0481 ms 100.0% 2025-09-07T11:00:02.6677293Z triton_convolution2d_673 0.1280 ms 37.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:02.6678933Z triton_convolution2d_672 0.1894 ms 25.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:02.6680441Z triton_convolution2d_675 0.1976 ms 24.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:02.6681938Z triton_convolution2d_674 0.2038 ms 23.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:02.6683611Z triton_convolution2d_670 0.2621 ms 18.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 
2025-09-07T11:00:02.6685114Z triton_convolution2d_669 0.2703 ms 17.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:02.6686608Z triton_convolution2d_671 0.4598 ms 10.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:00:02.6687939Z SingleProcess AUTOTUNE benchmarking takes 0.2697 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:00:03.0100212Z Autotune Choices Stats: 2025-09-07T11:00:03.0101387Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_745", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.01945599913597107, "best_triton_pos": 0} 2025-09-07T11:00:03.0242774Z AUTOTUNE addmm(950x256, 950x2048, 2048x256) 2025-09-07T11:00:03.0243259Z strides: [0, 1], [2048, 1], [1, 2048] 2025-09-07T11:00:03.0243647Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:03.0244474Z triton_mm_745 0.0195 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:03.0245240Z addmm 0.0236 ms 82.6% 2025-09-07T11:00:03.0245510Z bias_addmm 0.0246 ms 79.2% 2025-09-07T11:00:03.0246224Z triton_mm_741 0.0246 ms 79.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:03.0247413Z triton_mm_738 0.0256 ms 76.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 
2025-09-07T11:00:03.0248597Z triton_mm_744 0.0256 ms 76.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:00:03.0249988Z triton_mm_748 0.0287 ms 67.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:03.0251168Z triton_mm_739 0.0328 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:03.0252349Z triton_mm_740 0.0328 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:03.0253525Z triton_mm_747 0.0328 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:03.0254614Z SingleProcess AUTOTUNE benchmarking takes 0.3501 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T11:00:14.7560360Z W0907 11:00:14.755000 127386 site-packages/torch/_inductor/utils.py:2298] [4/0_1] DeviceCopy in input program 2025-09-07T11:00:14.7574274Z W0907 11:00:14.756000 127386 site-packages/torch/_inductor/utils.py:2298] [4/0_1] DeviceCopy in input program 2025-09-07T11:00:14.7585600Z W0907 11:00:14.758000 127386 site-packages/torch/_inductor/utils.py:2298] [4/0_1] DeviceCopy in input program 2025-09-07T11:00:14.7596165Z W0907 11:00:14.759000 127386 site-packages/torch/_inductor/utils.py:2298] [4/0_1] DeviceCopy in input program 2025-09-07T11:00:14.7606706Z W0907 11:00:14.760000 127386 site-packages/torch/_inductor/utils.py:2298] [4/0_1] DeviceCopy in input program 2025-09-07T11:00:15.7967810Z Autotune Choices Stats: 2025-09-07T11:00:15.7969055Z {"num_choices": 18, "num_triton_choices": 16, 
"best_kernel": "triton_mm_848", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.04198399931192398, "best_triton_pos": 0} 2025-09-07T11:00:15.8375311Z AUTOTUNE addmm(60800x3, 60800x256, 256x3) 2025-09-07T11:00:15.8375944Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:00:15.8376388Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:15.8377809Z triton_mm_848 0.0420 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:15.8379131Z triton_mm_854 0.0420 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:15.8380393Z triton_mm_859 0.0420 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:15.8381661Z triton_mm_851 0.0430 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:15.8382928Z triton_mm_858 0.0430 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:15.8384186Z triton_mm_847 0.0440 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:15.8385444Z triton_mm_853 0.0440 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:15.8386691Z triton_mm_852 0.0451 ms 93.2% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:15.8387961Z triton_mm_856 0.0451 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:15.8389346Z triton_mm_855 0.0461 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:15.8390454Z SingleProcess AUTOTUNE benchmarking takes 0.3608 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T11:00:16.1528962Z Autotune Choices Stats: 2025-09-07T11:00:16.1531601Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_864", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.04095999896526337, "best_triton_pos": 0} 2025-09-07T11:00:16.1786490Z AUTOTUNE addmm(60800x12, 60800x256, 256x12) 2025-09-07T11:00:16.1787359Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:00:16.1788369Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:16.1789822Z triton_mm_864 0.0410 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:16.1791935Z triton_mm_870 0.0410 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:16.1793785Z triton_mm_867 0.0420 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:16.1795562Z triton_mm_874 0.0420 ms 97.6% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:16.1797396Z triton_mm_875 0.0420 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:16.1799214Z triton_mm_861 0.0430 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T11:00:16.1801486Z triton_mm_863 0.0440 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:16.1803645Z triton_mm_868 0.0440 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:16.1805467Z triton_mm_869 0.0440 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:16.1806648Z bias_addmm 0.0451 ms 90.9% 2025-09-07T11:00:16.1807496Z SingleProcess AUTOTUNE benchmarking takes 0.3400 seconds and 0.0004 seconds precompiling for 18 choices 2025-09-07T11:00:16.4779428Z Autotune Choices Stats: 2025-09-07T11:00:16.4780688Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_886", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.01740800030529499, "best_triton_pos": 0} 2025-09-07T11:00:16.4913005Z AUTOTUNE addmm(15200x3, 15200x256, 256x3) 2025-09-07T11:00:16.4913358Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:00:16.4913717Z dtypes: torch.float16, torch.float16, torch.float16 
2025-09-07T11:00:16.4914523Z triton_mm_886 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:16.4915733Z triton_mm_887 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:16.4917206Z triton_mm_890 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:16.4918406Z triton_mm_891 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:16.4919593Z triton_mm_892 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:16.4920787Z triton_mm_893 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:16.4922069Z triton_mm_897 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:16.4923430Z triton_mm_889 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:00:16.4924620Z triton_mm_894 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:16.4925808Z triton_mm_895 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:16.4926840Z SingleProcess AUTOTUNE benchmarking takes 0.3079 seconds and 0.0004 seconds precompiling for 18 choices 2025-09-07T11:00:16.7512251Z Autotune Choices Stats: 2025-09-07T11:00:16.7513422Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_900", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T11:00:16.7646122Z AUTOTUNE addmm(15200x12, 15200x256, 256x12) 2025-09-07T11:00:16.7646742Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:00:16.7647794Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:16.7649320Z triton_mm_900 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T11:00:16.7651541Z triton_mm_906 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:16.7653757Z triton_mm_909 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:16.7655961Z triton_mm_901 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T11:00:16.7658224Z triton_mm_902 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:16.7660389Z triton_mm_903 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, 
BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:16.7662570Z triton_mm_905 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:00:16.7664681Z triton_mm_907 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:16.7667074Z triton_mm_908 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:16.7669259Z triton_mm_910 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:16.7671137Z SingleProcess AUTOTUNE benchmarking takes 0.2726 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T11:00:17.0193735Z Autotune Choices Stats: 2025-09-07T11:00:17.0195861Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_925", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T11:00:17.0333711Z AUTOTUNE addmm(3800x3, 3800x256, 256x3) 2025-09-07T11:00:17.0334307Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:00:17.0334924Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:17.0336402Z triton_mm_925 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:17.0338547Z triton_mm_926 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, 
BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:17.0340586Z triton_mm_929 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:17.0342624Z triton_mm_931 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:17.0344704Z triton_mm_932 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:17.0347088Z triton_mm_923 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T11:00:17.0349133Z triton_mm_924 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T11:00:17.0351155Z triton_mm_930 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:17.0353103Z triton_mm_934 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:17.0355221Z triton_mm_936 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:17.0357107Z SingleProcess AUTOTUNE benchmarking takes 0.2642 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T11:00:17.2843110Z Autotune Choices Stats: 2025-09-07T11:00:17.2983133Z {"num_choices": 
18, "num_triton_choices": 16, "best_kernel": "triton_mm_939", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T11:00:17.2985364Z AUTOTUNE addmm(3800x12, 3800x256, 256x12) 2025-09-07T11:00:17.2985954Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:00:17.2986786Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:17.2988264Z triton_mm_939 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T11:00:17.2990473Z triton_mm_940 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T11:00:17.2992673Z triton_mm_941 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:17.2994868Z triton_mm_942 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:17.2996987Z triton_mm_945 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:17.2998991Z triton_mm_947 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:17.3001002Z triton_mm_948 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 
2025-09-07T11:00:17.3003220Z triton_mm_952 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:17.3005174Z triton_mm_946 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:17.3007360Z triton_mm_949 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:17.3009252Z SingleProcess AUTOTUNE benchmarking takes 0.2642 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T11:00:17.5529790Z Autotune Choices Stats: 2025-09-07T11:00:17.5531399Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_962", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T11:00:17.5670033Z AUTOTUNE addmm(950x3, 950x256, 256x3) 2025-09-07T11:00:17.5670362Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:00:17.5670716Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:17.5671551Z triton_mm_962 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T11:00:17.5672761Z triton_mm_963 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T11:00:17.5673958Z triton_mm_964 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 
2025-09-07T11:00:17.5675145Z triton_mm_965 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:17.5676342Z triton_mm_968 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:17.5677528Z triton_mm_970 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:17.5678851Z triton_mm_971 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:17.5680063Z triton_mm_976 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:17.5681250Z triton_mm_969 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:17.5682434Z triton_mm_973 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:17.5683717Z SingleProcess AUTOTUNE benchmarking takes 0.2643 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T11:00:17.8186129Z Autotune Choices Stats: 2025-09-07T11:00:17.8187287Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_979", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.008191999979317188, "best_triton_pos": 0} 
2025-09-07T11:00:17.8326257Z AUTOTUNE addmm(950x12, 950x256, 256x12) 2025-09-07T11:00:17.8326624Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:00:17.8326979Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:17.8327773Z triton_mm_979 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T11:00:17.8328968Z triton_mm_978 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T11:00:17.8330155Z triton_mm_980 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:17.8331339Z triton_mm_981 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:17.8332730Z triton_mm_984 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:17.8333923Z triton_mm_986 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:17.8335107Z triton_mm_987 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:17.8336280Z triton_mm_991 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:17.8337546Z triton_mm_992 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:17.8338312Z bias_addmm 0.0102 ms 80.0% 2025-09-07T11:00:17.8338868Z SingleProcess AUTOTUNE benchmarking takes 0.2651 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T11:00:18.0144745Z Autotune Choices Stats: 2025-09-07T11:00:18.0146357Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.018432000651955605, "best_triton_pos": 1, "best_triton_time": 0.0655359998345375, "best_triton_kernel": "triton_convolution2d_997", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:00:18.0280783Z AUTOTUNE convolution(1x256x13x19, 256x256x3x3) 2025-09-07T11:00:18.0281189Z strides: [63232, 1, 4864, 256], [2304, 1, 768, 256] 2025-09-07T11:00:18.0281549Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:18.0281869Z convolution 0.0184 ms 100.0% 2025-09-07T11:00:18.0282770Z triton_convolution2d_997 0.0655 ms 28.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:18.0284441Z triton_convolution2d_996 0.1034 ms 17.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:18.0286066Z triton_convolution2d_999 0.1055 ms 17.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:18.0287569Z triton_convolution2d_998 0.1065 ms 17.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, 
PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:00:18.0289069Z triton_convolution2d_995 0.1116 ms 16.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:00:18.0290569Z triton_convolution2d_993 0.1352 ms 13.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:18.0292062Z triton_convolution2d_994 0.1362 ms 13.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:00:18.0293238Z SingleProcess AUTOTUNE benchmarking takes 0.1949 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:00:18.2788430Z Autotune Choices Stats: 2025-09-07T11:00:18.2789839Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_1001", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T11:00:18.2929412Z AUTOTUNE addmm(247x3, 247x256, 256x3) 2025-09-07T11:00:18.2929788Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:00:18.2930149Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:18.2930989Z triton_mm_1001 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T11:00:18.2932216Z triton_mm_1002 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 
2025-09-07T11:00:18.2933440Z triton_mm_1010 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:18.2934635Z triton_mm_1003 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:18.2935828Z triton_mm_1004 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:18.2937026Z triton_mm_1007 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:18.2943516Z triton_mm_1009 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:18.2944715Z triton_mm_1014 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:18.2945914Z triton_mm_1015 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:18.2947105Z triton_mm_1008 0.0102 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:18.2948299Z SingleProcess AUTOTUNE benchmarking takes 0.2639 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T11:00:18.5439509Z Autotune Choices Stats: 2025-09-07T11:00:18.5440679Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_1018", "best_kernel_desc": 
"ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T11:00:18.5583191Z AUTOTUNE addmm(247x12, 247x256, 256x12) 2025-09-07T11:00:18.5583520Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:00:18.5583879Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:18.5584698Z triton_mm_1018 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T11:00:18.5585905Z triton_mm_1019 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:18.5587111Z triton_mm_1020 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:18.5588548Z triton_mm_1023 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:18.5589746Z triton_mm_1026 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:18.5590959Z triton_mm_1017 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T11:00:18.5592155Z triton_mm_1025 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:00:18.5593356Z triton_mm_1030 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:18.5594553Z triton_mm_1031 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:18.5595328Z bias_addmm 0.0102 ms 80.0% 2025-09-07T11:00:18.5595878Z SingleProcess AUTOTUNE benchmarking takes 0.2648 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T11:00:18.5636058Z cudagraph partition due to DeviceCopy ops. Found from : 2025-09-07T11:00:18.5636828Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T11:00:18.5637550Z anchors = self.anchor_generator(images, features) 2025-09-07T11:00:18.5638414Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T11:00:18.5639160Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T11:00:18.5639958Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T11:00:18.5640905Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:00:18.5641812Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T11:00:18.5642715Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:00:18.5643305Z 2025-09-07T11:00:18.5643597Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T11:00:18.5644318Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T11:00:18.5645034Z anchors = self.anchor_generator(images, features) 2025-09-07T11:00:18.5645779Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T11:00:18.5646537Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T11:00:18.5647328Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T11:00:18.5648245Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:00:18.5649143Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T11:00:18.5650049Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:00:18.5650453Z 2025-09-07T11:00:18.5650650Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T11:00:18.5651383Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T11:00:18.5652080Z anchors = self.anchor_generator(images, features) 2025-09-07T11:00:18.5652828Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T11:00:18.5653700Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T11:00:18.5654502Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T11:00:18.5655435Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:00:18.5656350Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T11:00:18.5657244Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:00:18.5657741Z 2025-09-07T11:00:18.5657923Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T11:00:18.5658664Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T11:00:18.5659378Z anchors = self.anchor_generator(images, features) 2025-09-07T11:00:18.5660134Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T11:00:18.5660881Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T11:00:18.5661675Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T11:00:18.5662612Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:00:18.5663515Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T11:00:18.5664483Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:00:18.5664891Z 2025-09-07T11:00:18.5665071Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T11:00:18.5665811Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T11:00:18.5666530Z anchors = self.anchor_generator(images, features) 2025-09-07T11:00:18.5667277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T11:00:18.5668033Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T11:00:18.5668813Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T11:00:18.5669797Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:00:18.5670702Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T11:00:18.5671601Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:00:18.5672008Z 2025-09-07T11:00:18.5834241Z cudagraph partition into 2 partitions 2025-09-07T11:00:22.5711709Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break from `Tensor.item()`, consider setting: 2025-09-07T11:00:22.5712867Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] torch._dynamo.config.capture_scalar_outputs = True 2025-09-07T11:00:22.5713758Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] or: 2025-09-07T11:00:22.5714636Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] env TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 2025-09-07T11:00:22.5715686Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] to include these operations in the captured graph. 
2025-09-07T11:00:22.5716562Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T11:00:22.5717826Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break: from user code at: 2025-09-07T11:00:22.5719361Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in torch_dynamo_resume_in_forward_at_117 2025-09-07T11:00:22.5721134Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T11:00:22.5722740Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 763, in forward 2025-09-07T11:00:22.5724423Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T11:00:22.5725828Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 310, in forward 2025-09-07T11:00:22.5727140Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] self.scales, self.map_levels = _setup_scales( 2025-09-07T11:00:22.5728480Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in _setup_scales 2025-09-07T11:00:22.5730024Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat 
in features] 2025-09-07T11:00:22.5731428Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in 2025-09-07T11:00:22.5732850Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T11:00:22.5734285Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T11:00:22.5735771Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T11:00:22.5736683Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T11:00:22.5737366Z W0907 11:00:22.570000 127386 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T11:00:22.8621047Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T11:00:22.8621945Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T11:00:22.8622855Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T11:00:22.8623218Z 2025-09-07T11:00:23.1228162Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T11:00:23.1229038Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T11:00:23.1229958Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T11:00:23.1230323Z 2025-09-07T11:00:30.9378278Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T11:00:30.9379770Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T11:00:30.9381090Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T11:00:30.9381798Z return fn(*args[2:], **kwargs) 2025-09-07T11:00:30.9382418Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:00:30.9383268Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:00:30.9384151Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:00:30.9384856Z v4 = masked_index(y_high, x_high) 2025-09-07T11:00:30.9385486Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:00:30.9386117Z return input[ 2025-09-07T11:00:30.9386261Z 2025-09-07T11:00:30.9386266Z 2025-09-07T11:00:35.1034023Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T11:00:35.1035513Z return 
_roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T11:00:35.1036458Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T11:00:35.1037436Z return fn(*args[2:], **kwargs) 2025-09-07T11:00:35.1038061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:00:35.1038911Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:00:35.1039812Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:00:35.1040506Z v4 = masked_index(y_high, x_high) 2025-09-07T11:00:35.1041138Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:00:35.1041765Z return input[ 2025-09-07T11:00:35.1041908Z 2025-09-07T11:00:35.1041913Z 2025-09-07T11:00:39.2151974Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T11:00:39.2153723Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T11:00:39.2154671Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T11:00:39.2155357Z return fn(*args[2:], **kwargs) 2025-09-07T11:00:39.2155999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:00:39.2156839Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:00:39.2157731Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:00:39.2158426Z v4 = masked_index(y_high, x_high) 2025-09-07T11:00:39.2159049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:00:39.2159685Z return input[ 2025-09-07T11:00:39.2159842Z 2025-09-07T11:00:39.2159847Z 2025-09-07T11:00:42.4509051Z Autotune Choices Stats: 2025-09-07T11:00:42.4510933Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.13926400244235992, "best_triton_pos": 1, "best_triton_time": 0.2088959962129593, "best_triton_kernel": "triton_mm_1043", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T11:00:42.4667839Z AUTOTUNE mm(1000x12544, 12544x1024) 2025-09-07T11:00:42.4668230Z strides: [12544, 1], [1, 12544] 2025-09-07T11:00:42.4668538Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:42.4668838Z mm 0.1393 ms 100.0% 2025-09-07T11:00:42.4669579Z triton_mm_1043 0.2089 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:42.4670831Z triton_mm_1048 0.2109 ms 66.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:42.4672048Z triton_mm_1049 0.2284 ms 61.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:42.4673262Z triton_mm_1039 0.2294 ms 60.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 
2025-09-07T11:00:42.4674460Z triton_mm_1040 0.2345 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:42.4675673Z triton_mm_1044 0.2478 ms 56.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:42.4679051Z triton_mm_1047 0.2570 ms 54.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:42.4680256Z triton_mm_1042 0.2662 ms 52.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:42.4681464Z triton_mm_1041 0.2703 ms 51.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:42.4682500Z SingleProcess AUTOTUNE benchmarking takes 0.7207 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T11:00:43.1577579Z Autotune Choices Stats: 2025-09-07T11:00:43.1579677Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.023552000522613525, "best_triton_pos": 1, "best_triton_time": 0.02457600086927414, "best_triton_kernel": "triton_mm_1061", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T11:00:43.1727633Z AUTOTUNE mm(1000x1024, 1024x1024) 2025-09-07T11:00:43.1728020Z strides: [1024, 1], [1, 1024] 2025-09-07T11:00:43.1728326Z dtypes: torch.float16, torch.float16 2025-09-07T11:00:43.1728664Z mm 0.0236 ms 100.0% 2025-09-07T11:00:43.1729383Z triton_mm_1061 0.0246 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:43.1730594Z triton_mm_1067 0.0246 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:43.1731810Z triton_mm_1066 0.0256 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:43.1733017Z triton_mm_1057 0.0276 ms 85.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:00:43.1734608Z triton_mm_1062 0.0276 ms 85.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:43.1735815Z triton_mm_1059 0.0287 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:43.1737012Z triton_mm_1060 0.0287 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:43.1738301Z triton_mm_1063 0.0287 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:43.1739506Z triton_mm_1065 0.0287 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:43.1740546Z SingleProcess AUTOTUNE benchmarking takes 0.3263 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T11:00:44.0070259Z Autotune Choices Stats: 
2025-09-07T11:00:44.0071502Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_1072", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T11:00:44.0227423Z AUTOTUNE addmm(1000x91, 1000x1024, 1024x91) 2025-09-07T11:00:44.0227815Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T11:00:44.0228548Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:44.0229397Z triton_mm_1072 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:44.0230634Z triton_mm_1076 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:44.0231841Z triton_mm_1069 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:00:44.0233043Z triton_mm_1070 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:44.0234331Z triton_mm_1071 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:44.0235529Z triton_mm_1075 0.0184 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:00:44.0236728Z triton_mm_1078 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:44.0237904Z triton_mm_1079 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:44.0239099Z triton_mm_1081 0.0225 ms 59.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:44.0240297Z triton_mm_1077 0.0276 ms 48.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:44.0241331Z SingleProcess AUTOTUNE benchmarking takes 0.3268 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T11:00:44.3339658Z Autotune Choices Stats: 2025-09-07T11:00:44.3341283Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_1094", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T11:00:44.3495714Z AUTOTUNE addmm(1000x364, 1000x1024, 1024x364) 2025-09-07T11:00:44.3496104Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T11:00:44.3496454Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:00:44.3497304Z triton_mm_1094 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:44.3498572Z triton_mm_1093 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:00:44.3499778Z triton_mm_1090 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:00:44.3500975Z triton_mm_1097 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:44.3502169Z triton_mm_1087 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:00:44.3503357Z triton_mm_1088 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:44.3504773Z triton_mm_1089 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:00:44.3505965Z triton_mm_1096 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:44.3507159Z triton_mm_1099 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:00:44.3508355Z triton_mm_1098 0.0256 ms 60.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:00:44.3509483Z SingleProcess AUTOTUNE benchmarking takes 0.3261 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T11:00:57.0781901Z W0907 11:00:57.076000 127386 site-packages/torch/fx/experimental/symbolic_shapes.py:2396] [30/4] RecursionError in sympy.xreplace(Eq(Mod(2*s29, s93), 0), {s29: evaluate_static_shape_0 + 1, s93: evaluate_static_shape_1 + 1}) 2025-09-07T11:00:58.2104292Z skipping cudagraphs due to disabling cudagraphs due to 
incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T11:00:58.2105773Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T11:00:58.2106699Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T11:00:58.2107411Z return fn(*args[2:], **kwargs) 2025-09-07T11:00:58.2108027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:00:58.2108869Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:00:58.2109761Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:00:58.2110442Z v4 = masked_index(y_high, x_high) 2025-09-07T11:00:58.2111437Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:00:58.2112070Z return input[ 2025-09-07T11:00:58.2112214Z 2025-09-07T11:00:58.2112218Z 2025-09-07T11:01:04.7702726Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T11:01:04.7705536Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T11:01:04.7707187Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T11:01:04.7708415Z return fn(*args[2:], **kwargs) 2025-09-07T11:01:04.7709520Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:01:04.7711039Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:01:04.7712532Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:01:04.7713730Z v4 = masked_index(y_high, x_high) 2025-09-07T11:01:04.7714911Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:01:04.7716038Z return input[ 2025-09-07T11:01:04.7716776Z 2025-09-07T11:01:04.7716782Z 2025-09-07T11:01:13.6119297Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T11:01:13.6120776Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T11:01:13.6121729Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T11:01:13.6122435Z return fn(*args[2:], **kwargs) 2025-09-07T11:01:13.6123241Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:01:13.6124092Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:01:13.6125270Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:01:13.6125959Z v4 = masked_index(y_high, x_high) 2025-09-07T11:01:13.6126595Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:01:13.6127217Z return input[ 
2025-09-07T11:01:13.6127359Z 2025-09-07T11:01:13.6127364Z 2025-09-07T11:01:17.1351108Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T11:01:17.1352682Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T11:01:17.1353603Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T11:01:17.1354306Z return fn(*args[2:], **kwargs) 2025-09-07T11:01:17.1354935Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:01:17.1355780Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:01:17.1356675Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:01:17.1357720Z v4 = masked_index(y_high, x_high) 2025-09-07T11:01:17.1358362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:01:17.1358993Z return input[ 2025-09-07T11:01:17.1359138Z 2025-09-07T11:01:17.1359143Z 2025-09-07T11:01:18.3610021Z Autotune Choices Stats: 2025-09-07T11:01:18.3612211Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01945599913597107, "best_triton_pos": 1, "best_triton_time": 0.06656000018119812, "best_triton_kernel": "triton_convolution2d_1108", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:01:18.3773203Z AUTOTUNE convolution(4x256x14x14, 256x256x3x3) 
2025-09-07T11:01:18.3774179Z strides: [50176, 1, 3584, 256], [2304, 1, 768, 256] 2025-09-07T11:01:18.3774973Z dtypes: torch.float16, torch.float16 2025-09-07T11:01:18.3775737Z convolution 0.0195 ms 100.0% 2025-09-07T11:01:18.3776729Z triton_convolution2d_1108 0.0666 ms 29.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:01:18.3778421Z triton_convolution2d_1107 0.0983 ms 19.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:01:18.3780288Z triton_convolution2d_1109 0.1055 ms 18.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:01:18.3781815Z triton_convolution2d_1110 0.1065 ms 18.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:01:18.3783342Z triton_convolution2d_1105 0.1260 ms 15.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:01:18.3784837Z triton_convolution2d_1104 0.1341 ms 14.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:01:18.3786470Z triton_convolution2d_1106 0.2140 ms 9.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:01:18.3787673Z SingleProcess 
AUTOTUNE benchmarking takes 0.2199 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:01:19.8171847Z Autotune Choices Stats: 2025-09-07T11:01:19.8173108Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_1133", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.01228800043463707, "best_triton_pos": 0} 2025-09-07T11:01:19.8327087Z AUTOTUNE addmm(3136x91, 3136x256, 256x91) 2025-09-07T11:01:19.8327507Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:01:19.8327852Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:01:19.8328702Z triton_mm_1133 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:01:19.8329915Z triton_mm_1139 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:01:19.8331477Z triton_mm_1134 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:01:19.8332670Z triton_mm_1135 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:01:19.8333864Z triton_mm_1140 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:01:19.8335055Z triton_mm_1142 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:19.8336256Z triton_mm_1136 
0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:01:19.8337537Z triton_mm_1143 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:19.8338735Z triton_mm_1145 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:19.8339931Z triton_mm_1141 0.0154 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:19.8341059Z SingleProcess AUTOTUNE benchmarking takes 0.3027 seconds and 1.1377 seconds precompiling for 20 choices 2025-09-07T11:01:25.3715199Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 418, in torch_dynamo_resume_in_paste_mask_in_image_at_410 2025-09-07T11:01:25.3716737Z mask = F.interpolate(mask, size=(h, w), mode="bilinear", align_corners=False) 2025-09-07T11:01:25.3717114Z 2025-09-07T11:01:25.3717119Z 2025-09-07T11:01:26.3849187Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 418, in torch_dynamo_resume_in_paste_mask_in_image_at_410 2025-09-07T11:01:26.3850982Z mask = F.interpolate(mask, size=(h, w), mode="bilinear", align_corners=False) 2025-09-07T11:01:26.3851351Z 2025-09-07T11:01:26.3851363Z 2025-09-07T11:01:27.9462183Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 823, in torch_dynamo_resume_in_forward_at_806 2025-09-07T11:01:27.9463568Z masks_probs = maskrcnn_inference(mask_logits, labels) 2025-09-07T11:01:27.9464406Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 79, in maskrcnn_inference 2025-09-07T11:01:27.9472497Z mask_prob = mask_prob[index, labels][:, None] 2025-09-07T11:01:27.9472792Z 2025-09-07T11:01:27.9472797Z 2025-09-07T11:01:35.1316976Z Autotune Choices Stats: 2025-09-07T11:01:35.1318492Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.11468800157308578, "best_triton_pos": 1, "best_triton_time": 0.1515520066022873, "best_triton_kernel": "triton_mm_1291", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T11:01:35.1467504Z AUTOTUNE mm(1024x1000, 1000x12544) 2025-09-07T11:01:35.1467836Z strides: [1, 1024], [12544, 1] 2025-09-07T11:01:35.1468143Z dtypes: torch.float16, torch.float16 2025-09-07T11:01:35.1468458Z mm 0.1147 ms 100.0% 2025-09-07T11:01:35.1469546Z triton_mm_1291 0.1516 ms 75.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.1470780Z triton_mm_1292 0.1567 ms 73.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.1472002Z triton_mm_1285 0.1577 ms 72.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.1473216Z triton_mm_1287 0.1649 ms 69.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, 
BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.1474431Z triton_mm_1288 0.1669 ms 68.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.1475648Z triton_mm_1293 0.1956 ms 58.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:01:35.1476862Z triton_mm_1286 0.1976 ms 58.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:35.1478051Z triton_mm_1282 0.2048 ms 56.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:01:35.1479366Z triton_mm_1289 0.2048 ms 56.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:35.1480405Z SingleProcess AUTOTUNE benchmarking takes 0.6386 seconds and 3.0384 seconds precompiling for 19 choices 2025-09-07T11:01:35.6730781Z Autotune Choices Stats: 2025-09-07T11:01:35.6732267Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.02252800017595291, "best_triton_pos": 1, "best_triton_time": 0.025599999353289604, "best_triton_kernel": "triton_mm_1256", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T11:01:35.6875359Z AUTOTUNE mm(1024x1000, 1000x1024) 2025-09-07T11:01:35.6875675Z strides: [1, 1024], [1024, 1] 2025-09-07T11:01:35.6875981Z dtypes: torch.float16, torch.float16 2025-09-07T11:01:35.6876293Z mm 0.0225 ms 100.0% 
2025-09-07T11:01:35.6877021Z triton_mm_1256 0.0256 ms 88.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.6878258Z triton_mm_1251 0.0266 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.6879466Z triton_mm_1257 0.0266 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:01:35.6880665Z triton_mm_1247 0.0276 ms 81.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:01:35.6881871Z triton_mm_1252 0.0276 ms 81.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.6883288Z triton_mm_1248 0.0287 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:01:35.6884729Z triton_mm_1249 0.0287 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.6885952Z triton_mm_1255 0.0297 ms 75.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.6887169Z triton_mm_1250 0.0317 ms 71.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:35.6888199Z SingleProcess AUTOTUNE benchmarking takes 0.3020 seconds and 0.0002 
seconds precompiling for 19 choices 2025-09-07T11:01:35.9692226Z Autotune Choices Stats: 2025-09-07T11:01:35.9693462Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_1210", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T11:01:35.9835984Z AUTOTUNE mm(1000x91, 91x1024) 2025-09-07T11:01:35.9836335Z strides: [91, 1], [1024, 1] 2025-09-07T11:01:35.9836652Z dtypes: torch.float16, torch.float16 2025-09-07T11:01:35.9837439Z triton_mm_1210 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:01:35.9838688Z triton_mm_1212 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:01:35.9840148Z triton_mm_1213 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.9841367Z triton_mm_1214 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:35.9842587Z triton_mm_1216 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.9843997Z triton_mm_1206 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:01:35.9845303Z triton_mm_1209 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:01:35.9846502Z triton_mm_1211 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:01:35.9847711Z triton_mm_1215 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:35.9848912Z triton_mm_1217 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:35.9849957Z SingleProcess AUTOTUNE benchmarking takes 0.2721 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T11:01:36.8433207Z Autotune Choices Stats: 2025-09-07T11:01:36.8434756Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.021503999829292297, "best_triton_pos": 1, "best_triton_time": 0.023552000522613525, "best_triton_kernel": "triton_mm_1233", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T11:01:36.8576471Z AUTOTUNE mm(1000x1024, 1024x1024) 2025-09-07T11:01:36.8577153Z strides: [1024, 1], [1024, 1] 2025-09-07T11:01:36.8577517Z dtypes: torch.float16, torch.float16 2025-09-07T11:01:36.8577828Z mm 0.0215 ms 100.0% 2025-09-07T11:01:36.8578547Z triton_mm_1233 0.0236 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:36.8579766Z triton_mm_1239 0.0246 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T11:01:36.8580970Z triton_mm_1229 0.0266 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:01:36.8582150Z triton_mm_1231 0.0266 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:36.8583357Z triton_mm_1238 0.0266 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:36.8584556Z triton_mm_1232 0.0287 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:36.8585756Z triton_mm_1234 0.0287 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:36.8587063Z triton_mm_1235 0.0287 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:36.8588280Z triton_mm_1237 0.0287 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:36.8589327Z SingleProcess AUTOTUNE benchmarking takes 0.2947 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T11:01:37.3684827Z Autotune Choices Stats: 2025-09-07T11:01:37.3686058Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_1157", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.018432000651955605, "best_triton_pos": 0} 
2025-09-07T11:01:37.3830022Z AUTOTUNE mm(364x1000, 1000x1024) 2025-09-07T11:01:37.3830343Z strides: [1, 364], [1024, 1] 2025-09-07T11:01:37.3830638Z dtypes: torch.float16, torch.float16 2025-09-07T11:01:37.3831397Z triton_mm_1157 0.0184 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:01:37.3832630Z triton_mm_1152 0.0205 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:01:37.3833384Z mm 0.0215 ms 85.7% 2025-09-07T11:01:37.3834082Z triton_mm_1158 0.0215 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:01:37.3835291Z triton_mm_1161 0.0215 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:37.3836497Z triton_mm_1160 0.0225 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:37.3837681Z triton_mm_1153 0.0236 ms 78.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:01:37.3839077Z triton_mm_1163 0.0236 ms 78.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:37.3840292Z triton_mm_1151 0.0256 ms 72.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:01:37.3841497Z triton_mm_1159 0.0256 ms 72.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:37.3842541Z SingleProcess AUTOTUNE benchmarking takes 0.2888 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T11:01:37.8935743Z Autotune Choices Stats: 2025-09-07T11:01:37.8937360Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.01945599913597107, "best_triton_pos": 1, "best_triton_time": 0.02457600086927414, "best_triton_kernel": "triton_mm_1169", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4"} 2025-09-07T11:01:37.9084704Z AUTOTUNE mm(91x1000, 1000x1024) 2025-09-07T11:01:37.9085029Z strides: [1, 91], [1024, 1] 2025-09-07T11:01:37.9085328Z dtypes: torch.float16, torch.float16 2025-09-07T11:01:37.9085635Z mm 0.0195 ms 100.0% 2025-09-07T11:01:37.9086356Z triton_mm_1169 0.0246 ms 79.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:01:37.9087852Z triton_mm_1172 0.0276 ms 70.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:01:37.9089075Z triton_mm_1170 0.0287 ms 67.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:01:37.9090274Z triton_mm_1175 0.0307 ms 63.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:01:37.9091475Z triton_mm_1179 0.0328 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:37.9092761Z triton_mm_1171 0.0358 ms 54.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:01:37.9093958Z triton_mm_1174 0.0389 ms 50.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:01:37.9095166Z triton_mm_1176 0.0410 ms 47.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:01:37.9096391Z triton_mm_1177 0.0420 ms 46.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:37.9097529Z SingleProcess AUTOTUNE benchmarking takes 0.3122 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T11:01:38.7410594Z Autotune Choices Stats: 2025-09-07T11:01:38.7411861Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_1195", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T11:01:38.7555041Z AUTOTUNE mm(1000x364, 364x1024) 2025-09-07T11:01:38.7555388Z strides: [364, 1], [1024, 1] 2025-09-07T11:01:38.7555678Z dtypes: torch.float16, torch.float16 2025-09-07T11:01:38.7556834Z triton_mm_1195 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:38.7558080Z triton_mm_1197 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=4 2025-09-07T11:01:38.7559308Z triton_mm_1196 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:38.7560536Z triton_mm_1201 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:38.7561758Z triton_mm_1192 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:01:38.7563155Z triton_mm_1198 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:38.7564369Z triton_mm_1193 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:01:38.7565571Z triton_mm_1199 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:38.7566885Z triton_mm_1203 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:01:38.7567647Z mm 0.0205 ms 80.0% 2025-09-07T11:01:38.7568185Z SingleProcess AUTOTUNE benchmarking takes 0.2702 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T11:01:39.3643794Z Autotune Choices Stats: 2025-09-07T11:01:39.3645280Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.12185599654912949, "best_triton_pos": 1, "best_triton_time": 0.14643199741840363, "best_triton_kernel": "triton_mm_1274", "best_triton_kernel_desc": 
"ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T11:01:39.3788489Z AUTOTUNE mm(1000x1024, 1024x12544) 2025-09-07T11:01:39.3788822Z strides: [1024, 1], [12544, 1] 2025-09-07T11:01:39.3789126Z dtypes: torch.float16, torch.float16 2025-09-07T11:01:39.3789422Z mm 0.1219 ms 100.0% 2025-09-07T11:01:39.3790146Z triton_mm_1274 0.1464 ms 83.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:39.3791375Z triton_mm_1273 0.1495 ms 81.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:39.3792584Z triton_mm_1269 0.1577 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:39.3793781Z triton_mm_1267 0.1587 ms 76.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:39.3794983Z triton_mm_1270 0.1659 ms 73.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:01:39.3796181Z triton_mm_1275 0.1669 ms 73.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:01:39.3797581Z triton_mm_1268 0.1812 ms 67.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:39.3798792Z triton_mm_1271 0.1976 ms 61.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, 
BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:01:39.3800007Z triton_mm_1272 0.2017 ms 60.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T11:01:39.3801047Z SingleProcess AUTOTUNE benchmarking takes 0.6225 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T11:01:41.4560650Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 256, in torch_dynamo_resume_in_roi_align_at_255 2025-09-07T11:01:41.4562141Z return _roi_align(input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned) 2025-09-07T11:01:41.4563305Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/polyfills/__init__.py", line 264, in getattr_and_trace 2025-09-07T11:01:41.4564014Z return fn(*args[2:], **kwargs) 2025-09-07T11:01:41.4564657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:01:41.4565768Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:01:41.4566651Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:01:41.4567352Z v4 = masked_index(y_high, x_high) 2025-09-07T11:01:41.4567989Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:01:41.4568625Z return input[ 2025-09-07T11:01:41.4568769Z 2025-09-07T11:01:41.4568774Z 2025-09-07T11:01:46.3406653Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 114, in forward 2025-09-07T11:01:46.3408254Z features = self.backbone(images.tensors) 2025-09-07T11:01:46.3408998Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/backbone_utils.py", line 58, in forward 2025-09-07T11:01:46.3409712Z x = self.fpn(x) 2025-09-07T11:01:46.3410343Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/feature_pyramid_network.py", line 194, in forward 2025-09-07T11:01:46.3411162Z inner_top_down = F.interpolate(last_inner, size=feat_shape, mode="nearest") 2025-09-07T11:01:46.3411533Z 2025-09-07T11:01:46.3411538Z 2025-09-07T11:01:47.2917192Z W0907 11:01:47.290000 127386 site-packages/torch/_logging/_internal.py:1199] [51/0] Profiler function will be ignored 2025-09-07T11:01:59.3019091Z W0907 11:01:59.301000 127386 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] torch._dynamo hit config.recompile_limit (8) 2025-09-07T11:01:59.3021608Z W0907 11:01:59.301000 127386 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] function: 'torch_dynamo_resume_in_roi_align_at_255' (/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py:255) 2025-09-07T11:01:59.3024369Z W0907 11:01:59.301000 127386 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] last reason: 30/7: tensor 'rois' requires_grad mismatch. expected requires_grad=1 2025-09-07T11:01:59.3026517Z W0907 11:01:59.301000 127386 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". 
2025-09-07T11:01:59.3029527Z W0907 11:01:59.301000 127386 site-packages/torch/_dynamo/convert_frame.py:1358] [30/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html 2025-09-07T11:02:01.8757148Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:02:01.8759447Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:02:01.8761038Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:02:01.8762264Z v4 = masked_index(y_high, x_high) 2025-09-07T11:02:01.8763675Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:02:01.8764757Z return input[ 2025-09-07T11:02:01.8765011Z 2025-09-07T11:02:01.8765018Z 2025-09-07T11:02:04.5021777Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:02:04.5023103Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:02:04.5024002Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:02:04.5024990Z v4 = masked_index(y_high, x_high) 2025-09-07T11:02:04.5025614Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:02:04.5026241Z return input[ 2025-09-07T11:02:04.5026388Z 2025-09-07T11:02:04.5026408Z 2025-09-07T11:02:06.7855348Z skipping cudagraphs due to disabling cudagraphs due to incompatible op 
aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:02:06.7856665Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:02:06.7857629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:02:06.7858336Z v4 = masked_index(y_high, x_high) 2025-09-07T11:02:06.7859249Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:02:06.7859895Z return input[ 2025-09-07T11:02:06.7860039Z 2025-09-07T11:02:06.7860044Z 2025-09-07T11:02:09.1139425Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:02:09.1140774Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:02:09.1141685Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:02:09.1142381Z v4 = masked_index(y_high, x_high) 2025-09-07T11:02:09.1143004Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:02:09.1143638Z return input[ 2025-09-07T11:02:09.1143792Z 2025-09-07T11:02:09.1143797Z 2025-09-07T11:02:15.0842615Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:02:15.0844176Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:02:15.0845078Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:02:15.0846185Z v4 = masked_index(y_high, x_high) 2025-09-07T11:02:15.0846839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:02:15.0847454Z return input[ 2025-09-07T11:02:15.0847613Z 2025-09-07T11:02:15.0847618Z 2025-09-07T11:02:19.8342730Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:02:19.8345014Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:02:19.8346484Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:02:19.8347713Z v4 = masked_index(y_high, x_high) 2025-09-07T11:02:19.8348848Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 2025-09-07T11:02:19.8349867Z return input[ 2025-09-07T11:02:19.8350106Z 2025-09-07T11:02:19.8350114Z 2025-09-07T11:02:23.4658674Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 185, in _roi_align 2025-09-07T11:02:23.4661106Z val = _bilinear_interpolate(input, roi_batch_ind, y, x, ymask, xmask) # [K, C, PH, PW, IY, IX] 2025-09-07T11:02:23.4663106Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 84, in _bilinear_interpolate 2025-09-07T11:02:23.4664356Z v4 = masked_index(y_high, x_high) 2025-09-07T11:02:23.4665475Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py", line 74, in masked_index 
2025-09-07T11:02:23.4666583Z return input[ 2025-09-07T11:02:23.4666851Z 2025-09-07T11:02:23.4666861Z 2025-09-07T11:02:35.9048463Z W0907 11:02:35.904000 127386 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T11:02:35.9337669Z W0907 11:02:35.933000 127386 site-packages/torch/_dynamo/utils.py:3060] Found nan in reference. Consider running in higher precision. 2025-09-07T11:02:36.0302085Z pass 2025-09-07T11:02:42.3332904Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T11:02:42.3340286Z import pynvml # type: ignore[import] 2025-09-07T11:02:45.0032715Z 2025-09-07T11:02:48.1287023Z loading model: 0it [00:00, ?it/s] 2025-09-07T11:02:48.1287408Z loading model: 0it [00:03, ?it/s] 2025-09-07T11:02:48.1287734Z cuda train yolov3 2025-09-07T11:03:27.6422064Z Autotune Choices Stats: 2025-09-07T11:03:27.6423793Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.0798719972372055, "best_triton_pos": 1, "best_triton_time": 0.19763199985027313, "best_triton_kernel": "triton_convolution2d_2", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8"} 2025-09-07T11:03:27.6577031Z AUTOTUNE convolution(4x3x384x512, 32x3x3x3) 2025-09-07T11:03:27.6577471Z strides: [589824, 1, 1536, 3], [27, 1, 9, 3] 2025-09-07T11:03:27.6577819Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:27.6578131Z convolution 0.0799 ms 100.0% 2025-09-07T11:03:27.6579027Z triton_convolution2d_2 0.1976 ms 40.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, 
PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:03:27.6580908Z triton_convolution2d_4 0.2273 ms 35.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:27.6582393Z triton_convolution2d_0 0.2417 ms 33.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:27.6583879Z triton_convolution2d_3 0.2632 ms 30.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:27.6585363Z triton_convolution2d_1 0.3287 ms 24.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:27.6586831Z triton_convolution2d_5 0.3645 ms 21.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:27.6587997Z SingleProcess AUTOTUNE benchmarking takes 0.2767 seconds and 0.0003 seconds precompiling for 7 choices 2025-09-07T11:03:27.8694564Z Autotune Choices Stats: 2025-09-07T11:03:27.8696209Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.09113600105047226, "best_triton_pos": 1, "best_triton_time": 0.10342399775981903, "best_triton_kernel": "triton_convolution2d_9", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T11:03:27.8842127Z AUTOTUNE 
convolution(4x32x384x512, 64x32x3x3) 2025-09-07T11:03:27.8842514Z strides: [6291456, 1, 16384, 32], [288, 1, 96, 32] 2025-09-07T11:03:27.8843102Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:27.8843418Z convolution 0.0911 ms 100.0% 2025-09-07T11:03:27.8844303Z triton_convolution2d_9 0.1034 ms 88.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:27.8845787Z triton_convolution2d_10 0.1055 ms 86.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:27.8847410Z triton_convolution2d_11 0.1085 ms 84.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:27.8848895Z triton_convolution2d_7 0.1147 ms 79.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:27.8850383Z triton_convolution2d_12 0.1167 ms 78.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:27.8851857Z triton_convolution2d_6 0.2007 ms 45.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:27.8853330Z triton_convolution2d_8 0.3297 ms 27.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:03:27.8854502Z 
SingleProcess AUTOTUNE benchmarking takes 0.2249 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:03:28.1668730Z Autotune Choices Stats: 2025-09-07T11:03:28.1670216Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_14", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.03686400130391121, "best_triton_pos": 0} 2025-09-07T11:03:28.1815794Z AUTOTUNE mm(196608x64, 64x32) 2025-09-07T11:03:28.1816121Z strides: [64, 1], [1, 64] 2025-09-07T11:03:28.1816415Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:28.1817178Z triton_mm_14 0.0369 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:03:28.1818425Z triton_mm_17 0.0369 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:28.1819624Z triton_mm_23 0.0369 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:28.1820803Z triton_mm_24 0.0369 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:03:28.1821989Z triton_mm_29 0.0369 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:28.1823179Z triton_mm_20 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:28.1824545Z triton_mm_28 0.0379 ms 97.3% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:28.1825285Z mm 0.0399 ms 92.3% 2025-09-07T11:03:28.1825959Z triton_mm_21 0.0410 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:28.1827125Z triton_mm_19 0.0420 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:03:28.1828142Z SingleProcess AUTOTUNE benchmarking takes 0.2958 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T11:03:28.3780940Z Autotune Choices Stats: 2025-09-07T11:03:28.3782599Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.0655359998345375, "best_triton_pos": 1, "best_triton_time": 0.08191999793052673, "best_triton_kernel": "triton_convolution2d_33", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T11:03:28.3926645Z AUTOTUNE convolution(4x32x192x256, 64x32x3x3) 2025-09-07T11:03:28.3927066Z strides: [1572864, 1, 8192, 32], [288, 1, 96, 32] 2025-09-07T11:03:28.3927415Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:28.3927734Z convolution 0.0655 ms 100.0% 2025-09-07T11:03:28.3928618Z triton_convolution2d_33 0.0819 ms 80.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:28.3930115Z triton_convolution2d_36 0.0942 ms 69.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, 
num_warps=8 2025-09-07T11:03:28.3931598Z triton_convolution2d_31 0.0952 ms 68.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:28.3933321Z triton_convolution2d_34 0.0973 ms 67.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:28.3934783Z triton_convolution2d_35 0.0983 ms 66.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:28.3936266Z triton_convolution2d_30 0.1946 ms 33.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:28.3937834Z triton_convolution2d_32 0.2079 ms 31.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:03:28.3939020Z SingleProcess AUTOTUNE benchmarking takes 0.2105 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:03:28.5842283Z Autotune Choices Stats: 2025-09-07T11:03:28.5844115Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.062463998794555664, "best_triton_pos": 1, "best_triton_time": 0.07168000191450119, "best_triton_kernel": "triton_convolution2d_40", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T11:03:28.5993060Z AUTOTUNE convolution(4x64x192x256, 128x64x3x3) 2025-09-07T11:03:28.5993484Z 
strides: [3145728, 1, 16384, 64], [576, 1, 192, 64] 2025-09-07T11:03:28.5993836Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:28.5994166Z convolution 0.0625 ms 100.0% 2025-09-07T11:03:28.5995076Z triton_convolution2d_40 0.0717 ms 87.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:28.5996582Z triton_convolution2d_43 0.0860 ms 72.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:28.5998058Z triton_convolution2d_41 0.0901 ms 69.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:28.6001913Z triton_convolution2d_42 0.0901 ms 69.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:28.6003542Z triton_convolution2d_38 0.1034 ms 60.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:28.6005040Z triton_convolution2d_37 0.1270 ms 49.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:28.6006526Z triton_convolution2d_39 0.2673 ms 23.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:03:28.6007706Z SingleProcess AUTOTUNE benchmarking takes 0.2049 seconds 
and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:03:28.8734920Z Autotune Choices Stats: 2025-09-07T11:03:28.8736445Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_60", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.023552000522613525, "best_triton_pos": 0} 2025-09-07T11:03:28.8885286Z AUTOTUNE mm(49152x128, 128x64) 2025-09-07T11:03:28.8885612Z strides: [128, 1], [1, 128] 2025-09-07T11:03:28.8885914Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:28.8886709Z triton_mm_60 0.0236 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:28.8887917Z triton_mm_51 0.0246 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:28.8889104Z triton_mm_55 0.0246 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:28.8890288Z triton_mm_57 0.0246 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:28.8891452Z triton_mm_54 0.0256 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:28.8892633Z triton_mm_56 0.0256 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T11:03:28.8893380Z mm 0.0266 ms 88.5% 2025-09-07T11:03:28.8894305Z triton_mm_52 0.0266 ms 88.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:28.8895478Z triton_mm_53 0.0266 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:28.8896651Z triton_mm_58 0.0266 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:28.8897746Z SingleProcess AUTOTUNE benchmarking takes 0.2873 seconds and 0.0004 seconds precompiling for 19 choices 2025-09-07T11:03:29.0689174Z Autotune Choices Stats: 2025-09-07T11:03:29.0690831Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04915200173854828, "best_triton_pos": 1, "best_triton_time": 0.06143999844789505, "best_triton_kernel": "triton_convolution2d_65", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T11:03:29.0839657Z AUTOTUNE convolution(4x64x96x128, 128x64x3x3) 2025-09-07T11:03:29.0840065Z strides: [786432, 1, 8192, 64], [576, 1, 192, 64] 2025-09-07T11:03:29.0840426Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:29.0840745Z convolution 0.0492 ms 100.0% 2025-09-07T11:03:29.0841636Z triton_convolution2d_65 0.0614 ms 80.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:29.0843368Z triton_convolution2d_68 0.0686 ms 71.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:29.0844868Z 
triton_convolution2d_67 0.0788 ms 62.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:29.0846591Z triton_convolution2d_66 0.0819 ms 60.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:29.0848082Z triton_convolution2d_63 0.0870 ms 56.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:29.0849563Z triton_convolution2d_62 0.1229 ms 40.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:29.0851035Z triton_convolution2d_64 0.2324 ms 21.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:03:29.0852217Z SingleProcess AUTOTUNE benchmarking takes 0.1949 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:03:29.2711015Z Autotune Choices Stats: 2025-09-07T11:03:29.2712672Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04608000069856644, "best_triton_pos": 1, "best_triton_time": 0.06451199948787689, "best_triton_kernel": "triton_convolution2d_97", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T11:03:29.2866914Z AUTOTUNE convolution(4x128x96x128, 256x128x3x3) 2025-09-07T11:03:29.2867326Z strides: [1572864, 1, 16384, 128], 
[1152, 1, 384, 128] 2025-09-07T11:03:29.2867892Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:29.2868207Z convolution 0.0461 ms 100.0% 2025-09-07T11:03:29.2869109Z triton_convolution2d_97 0.0645 ms 71.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:29.2870624Z triton_convolution2d_100 0.0666 ms 69.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:29.2872122Z triton_convolution2d_99 0.0676 ms 68.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:29.2873683Z triton_convolution2d_98 0.0819 ms 56.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:29.2875175Z triton_convolution2d_95 0.0829 ms 55.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:29.2876664Z triton_convolution2d_94 0.1034 ms 44.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:29.2878132Z triton_convolution2d_96 0.2458 ms 18.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:03:29.2879306Z SingleProcess AUTOTUNE benchmarking takes 0.1916 seconds and 0.0002 seconds precompiling 
for 8 choices 2025-09-07T11:03:29.5470259Z Autotune Choices Stats: 2025-09-07T11:03:29.5471407Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_110", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T11:03:29.5625316Z AUTOTUNE mm(12288x256, 256x128) 2025-09-07T11:03:29.5625951Z strides: [256, 1], [1, 256] 2025-09-07T11:03:29.5626263Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:29.5627035Z triton_mm_110 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:29.5628242Z triton_mm_112 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:29.5629458Z triton_mm_117 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:29.5630660Z triton_mm_111 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:29.5631851Z triton_mm_113 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:29.5633037Z triton_mm_114 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:29.5634226Z triton_mm_116 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:29.5635504Z triton_mm_118 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:29.5636258Z mm 0.0184 ms 88.9% 2025-09-07T11:03:29.5636950Z triton_mm_107 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:03:29.5637989Z SingleProcess AUTOTUNE benchmarking takes 0.2741 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T11:03:29.7359745Z Autotune Choices Stats: 2025-09-07T11:03:29.7361400Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04095999896526337, "best_triton_pos": 1, "best_triton_time": 0.058368001133203506, "best_triton_kernel": "triton_convolution2d_122", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T11:03:29.7512777Z AUTOTUNE convolution(4x128x48x64, 256x128x3x3) 2025-09-07T11:03:29.7513197Z strides: [393216, 1, 8192, 128], [1152, 1, 384, 128] 2025-09-07T11:03:29.7513570Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:29.7513890Z convolution 0.0410 ms 100.0% 2025-09-07T11:03:29.7514800Z triton_convolution2d_122 0.0584 ms 70.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:29.7516306Z triton_convolution2d_124 0.0635 ms 64.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:29.7517791Z 
triton_convolution2d_125 0.0635 ms 64.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:29.7519275Z triton_convolution2d_123 0.0758 ms 54.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:29.7520963Z triton_convolution2d_120 0.0799 ms 51.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:29.7522467Z triton_convolution2d_119 0.1014 ms 40.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:29.7524080Z triton_convolution2d_121 0.2324 ms 17.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:03:29.7525266Z SingleProcess AUTOTUNE benchmarking takes 0.1881 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:03:30.0179303Z Autotune Choices Stats: 2025-09-07T11:03:30.0180947Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04198399931192398, "best_triton_pos": 1, "best_triton_time": 0.07680000364780426, "best_triton_kernel": "triton_convolution2d_305", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:03:30.0331141Z AUTOTUNE convolution(4x256x48x64, 512x256x3x3) 2025-09-07T11:03:30.0331541Z strides: [786432, 1, 16384, 256], 
[2304, 1, 768, 256] 2025-09-07T11:03:30.0331915Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:30.0332238Z convolution 0.0420 ms 100.0% 2025-09-07T11:03:30.0333293Z triton_convolution2d_305 0.0768 ms 54.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:30.0334801Z triton_convolution2d_304 0.1075 ms 39.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:30.0336291Z triton_convolution2d_307 0.1096 ms 38.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:30.0337852Z triton_convolution2d_306 0.1126 ms 37.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:30.0339434Z triton_convolution2d_302 0.1434 ms 29.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:30.0340933Z triton_convolution2d_301 0.1464 ms 28.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:30.0342445Z triton_convolution2d_303 0.2519 ms 16.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:03:30.0343618Z SingleProcess AUTOTUNE benchmarking takes 0.2169 seconds and 0.0002 seconds 
precompiling for 8 choices 2025-09-07T11:03:30.2898168Z Autotune Choices Stats: 2025-09-07T11:03:30.2899361Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_319", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.014336000196635723, "best_triton_pos": 0} 2025-09-07T11:03:30.3049097Z AUTOTUNE mm(3072x512, 512x256) 2025-09-07T11:03:30.3049488Z strides: [512, 1], [1, 512] 2025-09-07T11:03:30.3049791Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:30.3050815Z triton_mm_319 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:30.3051582Z mm 0.0154 ms 93.3% 2025-09-07T11:03:30.3052273Z triton_mm_315 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:30.3053464Z triton_mm_318 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:30.3054654Z triton_mm_321 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:30.3055830Z triton_mm_325 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:30.3057028Z triton_mm_316 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:30.3058260Z triton_mm_317 0.0174 ms 82.4% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:30.3059450Z triton_mm_320 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:30.3060732Z triton_mm_324 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:30.3061763Z SingleProcess AUTOTUNE benchmarking takes 0.2701 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T11:03:30.5032204Z Autotune Choices Stats: 2025-09-07T11:03:30.5033872Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03993599861860275, "best_triton_pos": 1, "best_triton_time": 0.07372800260782242, "best_triton_kernel": "triton_convolution2d_330", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:03:30.5183005Z AUTOTUNE convolution(4x256x24x32, 512x256x3x3) 2025-09-07T11:03:30.5183429Z strides: [196608, 1, 8192, 256], [2304, 1, 768, 256] 2025-09-07T11:03:30.5183797Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:30.5184109Z convolution 0.0399 ms 100.0% 2025-09-07T11:03:30.5184996Z triton_convolution2d_330 0.0737 ms 54.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:30.5186502Z triton_convolution2d_329 0.1024 ms 39.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:30.5188003Z 
triton_convolution2d_332 0.1055 ms 37.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:30.5189495Z triton_convolution2d_331 0.1096 ms 36.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:30.5190989Z triton_convolution2d_327 0.1382 ms 28.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:30.5192668Z triton_convolution2d_326 0.1423 ms 28.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:30.5194163Z triton_convolution2d_328 0.2345 ms 17.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:03:30.5195351Z SingleProcess AUTOTUNE benchmarking takes 0.2128 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:03:30.8418588Z Autotune Choices Stats: 2025-09-07T11:03:30.8420255Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.050175998359918594, "best_triton_pos": 1, "best_triton_time": 0.13516800105571747, "best_triton_kernel": "triton_convolution2d_512", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:03:30.8566342Z AUTOTUNE convolution(4x512x24x32, 1024x512x3x3) 2025-09-07T11:03:30.8566767Z strides: [393216, 1, 16384, 
512], [4608, 1, 1536, 512] 2025-09-07T11:03:30.8567137Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:30.8567461Z convolution 0.0502 ms 100.0% 2025-09-07T11:03:30.8568367Z triton_convolution2d_512 0.1352 ms 37.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:30.8570099Z triton_convolution2d_511 0.2048 ms 24.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:30.8571604Z triton_convolution2d_513 0.2120 ms 23.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:30.8573094Z triton_convolution2d_514 0.2161 ms 23.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:30.8574570Z triton_convolution2d_508 0.2734 ms 18.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:30.8576139Z triton_convolution2d_509 0.2785 ms 18.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:30.8577721Z triton_convolution2d_510 0.4342 ms 11.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:03:30.8578901Z SingleProcess AUTOTUNE benchmarking takes 0.2723 seconds and 0.0002 seconds 
precompiling for 8 choices 2025-09-07T11:03:31.1229843Z Autotune Choices Stats: 2025-09-07T11:03:31.1231014Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_523", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.014336000196635723, "best_triton_pos": 0} 2025-09-07T11:03:31.1382454Z AUTOTUNE mm(768x1024, 1024x512) 2025-09-07T11:03:31.1382805Z strides: [1024, 1], [1, 1024] 2025-09-07T11:03:31.1383111Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:31.1383900Z triton_mm_523 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:31.1384998Z mm 0.0154 ms 93.3% 2025-09-07T11:03:31.1385703Z triton_mm_522 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:31.1386884Z triton_mm_519 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:31.1388086Z triton_mm_526 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:31.1389275Z triton_mm_516 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:03:31.1390455Z triton_mm_517 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:31.1391630Z triton_mm_518 0.0205 ms 70.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:31.1392804Z triton_mm_525 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:31.1393995Z triton_mm_528 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:31.1395107Z SingleProcess AUTOTUNE benchmarking takes 0.2798 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T11:03:31.3969917Z Autotune Choices Stats: 2025-09-07T11:03:31.3971619Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.048128001391887665, "best_triton_pos": 1, "best_triton_time": 0.13414399325847626, "best_triton_kernel": "triton_convolution2d_537", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T11:03:31.4123083Z AUTOTUNE convolution(4x512x12x16, 1024x512x3x3) 2025-09-07T11:03:31.4123743Z strides: [98304, 1, 8192, 512], [4608, 1, 1536, 512] 2025-09-07T11:03:31.4124113Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:31.4124426Z convolution 0.0481 ms 100.0% 2025-09-07T11:03:31.4125313Z triton_convolution2d_537 0.1341 ms 35.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:31.4126825Z triton_convolution2d_536 0.1956 ms 24.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:31.4128316Z 
triton_convolution2d_538 0.2089 ms 23.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:31.4129801Z triton_convolution2d_539 0.2130 ms 22.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T11:03:31.4131290Z triton_convolution2d_533 0.2693 ms 17.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:31.4132962Z triton_convolution2d_534 0.2734 ms 17.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T11:03:31.4134454Z triton_convolution2d_535 0.4076 ms 11.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T11:03:31.4135631Z SingleProcess AUTOTUNE benchmarking takes 0.2735 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T11:03:31.7554098Z Autotune Choices Stats: 2025-09-07T11:03:31.7555306Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_666", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.01945599913597107, "best_triton_pos": 0} 2025-09-07T11:03:31.7707736Z AUTOTUNE mm(768x2048, 2048x512) 2025-09-07T11:03:31.7708056Z strides: [2048, 1], [1, 2048] 2025-09-07T11:03:31.7708361Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:31.7709144Z triton_mm_666 0.0195 ms 100.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:31.7709904Z mm 0.0205 ms 95.0% 2025-09-07T11:03:31.7710598Z triton_mm_665 0.0256 ms 76.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:31.7711775Z triton_mm_662 0.0266 ms 73.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:31.7713201Z triton_mm_669 0.0287 ms 67.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:31.7714395Z triton_mm_659 0.0317 ms 61.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:03:31.7715578Z triton_mm_668 0.0317 ms 61.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:31.7716753Z triton_mm_660 0.0328 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:31.7718031Z triton_mm_661 0.0328 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:31.7719214Z triton_mm_671 0.0328 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:31.7720238Z SingleProcess AUTOTUNE benchmarking takes 0.3148 seconds and 0.0003 seconds precompiling for 19 choices 
2025-09-07T11:03:32.0872360Z Autotune Choices Stats: 2025-09-07T11:03:32.0873532Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_712", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T11:03:32.1028234Z AUTOTUNE addmm(768x255, 768x1024, 1024x255) 2025-09-07T11:03:32.1028585Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T11:03:32.1028951Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:03:32.1029755Z triton_mm_712 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:32.1031268Z triton_mm_710 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:32.1032460Z triton_mm_711 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:32.1033651Z triton_mm_716 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:32.1034841Z triton_mm_709 0.0174 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:03:32.1036025Z triton_mm_715 0.0174 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:32.1037191Z triton_mm_718 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:32.1038371Z triton_mm_719 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:32.1039557Z triton_mm_721 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:32.1040315Z bias_addmm 0.0256 ms 52.0% 2025-09-07T11:03:32.1040960Z SingleProcess AUTOTUNE benchmarking takes 0.3164 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T11:03:32.3510151Z Autotune Choices Stats: 2025-09-07T11:03:32.3511363Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_730", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T11:03:32.3665502Z AUTOTUNE mm(768x512, 512x256) 2025-09-07T11:03:32.3665833Z strides: [512, 1], [1, 512] 2025-09-07T11:03:32.3666132Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:32.3666893Z triton_mm_730 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:32.3667845Z mm 0.0113 ms 90.9% 2025-09-07T11:03:32.3668542Z triton_mm_728 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:32.3669750Z triton_mm_729 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:32.3670952Z triton_mm_734 0.0113 ms 90.9% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:32.3672139Z triton_mm_727 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:03:32.3673303Z triton_mm_733 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:32.3674479Z triton_mm_737 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:32.3675668Z triton_mm_736 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:32.3677016Z triton_mm_739 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:32.3678049Z SingleProcess AUTOTUNE benchmarking takes 0.2631 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T11:03:32.6343051Z Autotune Choices Stats: 2025-09-07T11:03:32.6344518Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.01740800030529499, "best_triton_pos": 1, "best_triton_time": 0.01740800030529499, "best_triton_kernel": "triton_mm_755", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T11:03:32.6497384Z AUTOTUNE mm(3072x768, 768x256) 2025-09-07T11:03:32.6497686Z strides: [768, 1], [1, 768] 2025-09-07T11:03:32.6497986Z dtypes: torch.float16, torch.float16 
2025-09-07T11:03:32.6498297Z mm 0.0174 ms 100.0% 2025-09-07T11:03:32.6499032Z triton_mm_755 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:32.6500230Z triton_mm_751 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:32.6501417Z triton_mm_754 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:32.6502772Z triton_mm_757 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:32.6503964Z triton_mm_752 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:32.6505160Z triton_mm_761 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:32.6506354Z triton_mm_760 0.0215 ms 81.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:32.6507628Z triton_mm_753 0.0225 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:32.6508810Z triton_mm_756 0.0225 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:32.6509839Z SingleProcess AUTOTUNE benchmarking 
takes 0.2826 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T11:03:32.9709688Z Autotune Choices Stats: 2025-09-07T11:03:32.9710880Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_826", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.01740800030529499, "best_triton_pos": 0} 2025-09-07T11:03:32.9874903Z AUTOTUNE addmm(3072x255, 3072x512, 512x255) 2025-09-07T11:03:32.9875282Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T11:03:32.9875632Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:03:32.9876435Z triton_mm_826 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:32.9877635Z triton_mm_829 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:32.9879087Z triton_mm_832 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:32.9880289Z triton_mm_830 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:32.9881473Z triton_mm_828 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:32.9882658Z triton_mm_831 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:32.9884084Z triton_mm_827 0.0215 ms 81.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:32.9885255Z triton_mm_822 0.0225 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:32.9886440Z triton_mm_836 0.0225 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:32.9887629Z triton_mm_820 0.0236 ms 73.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:03:32.9888753Z SingleProcess AUTOTUNE benchmarking takes 0.3109 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T11:03:33.2314506Z Autotune Choices Stats: 2025-09-07T11:03:33.2315693Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_844", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T11:03:33.2477352Z AUTOTUNE mm(3072x256, 256x128) 2025-09-07T11:03:33.2477718Z strides: [256, 1], [1, 256] 2025-09-07T11:03:33.2478016Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:33.2478784Z triton_mm_844 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:33.2480163Z triton_mm_840 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:33.2481362Z triton_mm_845 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T11:03:33.2482563Z triton_mm_847 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:33.2483963Z triton_mm_848 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.2485152Z triton_mm_850 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:33.2485909Z mm 0.0123 ms 83.3% 2025-09-07T11:03:33.2486601Z triton_mm_838 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:03:33.2487776Z triton_mm_839 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:33.2489118Z triton_mm_846 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.2490159Z SingleProcess AUTOTUNE benchmarking takes 0.2596 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T11:03:33.5164738Z Autotune Choices Stats: 2025-09-07T11:03:33.5165901Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_866", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.020479999482631683, "best_triton_pos": 0} 2025-09-07T11:03:33.5326144Z AUTOTUNE mm(12288x384, 384x128) 
2025-09-07T11:03:33.5326467Z strides: [384, 1], [1, 384] 2025-09-07T11:03:33.5326765Z dtypes: torch.float16, torch.float16 2025-09-07T11:03:33.5327540Z triton_mm_866 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.5328759Z triton_mm_871 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.5329971Z triton_mm_872 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:33.5331169Z triton_mm_865 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:33.5332117Z mm 0.0225 ms 90.9% 2025-09-07T11:03:33.5332810Z triton_mm_864 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.5334006Z triton_mm_867 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.5335179Z triton_mm_868 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:33.5336445Z triton_mm_870 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.5337715Z triton_mm_862 0.0246 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:33.5338747Z SingleProcess AUTOTUNE benchmarking takes 0.2842 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T11:03:33.8812576Z Autotune Choices Stats: 2025-09-07T11:03:33.8813812Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_941", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.02969600073993206, "best_triton_pos": 0} 2025-09-07T11:03:33.8976595Z AUTOTUNE addmm(12288x255, 12288x256, 256x255) 2025-09-07T11:03:33.8976978Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T11:03:33.8977393Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T11:03:33.8978215Z triton_mm_941 0.0297 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.8979439Z triton_mm_945 0.0297 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.8981011Z triton_mm_937 0.0307 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T11:03:33.8982211Z triton_mm_939 0.0307 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.8983402Z triton_mm_940 0.0307 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:33.8984600Z triton_mm_943 0.0307 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T11:03:33.8985780Z triton_mm_936 0.0317 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T11:03:33.8986960Z triton_mm_942 0.0328 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.8988153Z triton_mm_946 0.0328 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T11:03:33.8989346Z triton_mm_947 0.0348 ms 85.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T11:03:33.8990466Z SingleProcess AUTOTUNE benchmarking takes 0.3391 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T11:03:34.0212001Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T11:03:34.0212894Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T11:03:34.0213652Z pred = mod(*cloned_inputs) 2025-09-07T11:03:34.0214166Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T11:03:34.0214699Z return self.forward_once(x) 2025-09-07T11:03:34.0215218Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T11:03:34.0215775Z yolo_out.append(module(x, out)) 2025-09-07T11:03:34.0216274Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T11:03:34.0217073Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T11:03:34.0217750Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T11:03:34.0218339Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T11:03:34.0218591Z 2025-09-07T11:03:34.0218752Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T11:03:34.0219575Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T11:03:34.0220339Z pred = mod(*cloned_inputs) 2025-09-07T11:03:34.0220827Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T11:03:34.0221353Z return self.forward_once(x) 2025-09-07T11:03:34.0221853Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T11:03:34.0222415Z yolo_out.append(module(x, out)) 2025-09-07T11:03:34.0222919Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T11:03:34.0223489Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T11:03:34.0224052Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T11:03:34.0224638Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T11:03:34.0224901Z 2025-09-07T11:03:34.0225074Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T11:03:34.0226047Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T11:03:34.0226812Z pred = mod(*cloned_inputs) 2025-09-07T11:03:34.0227290Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T11:03:34.0227814Z return self.forward_once(x) 2025-09-07T11:03:34.0228331Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T11:03:34.0228892Z yolo_out.append(module(x, out)) 2025-09-07T11:03:34.0229403Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T11:03:34.0229956Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T11:03:34.0230535Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T11:03:34.0231117Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T11:03:34.0231368Z 2025-09-07T11:03:34.2914071Z cudagraph partition into 2 partitions 2025-09-07T11:04:19.8159748Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T11:04:19.8161006Z pred = mod(*cloned_inputs) 2025-09-07T11:04:19.8161508Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T11:04:19.8162054Z return self.forward_once(x) 2025-09-07T11:04:19.8162575Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 374, in forward_once 2025-09-07T11:04:19.8163615Z x = module(x) 2025-09-07T11:04:19.8163759Z 2025-09-07T11:04:19.8163775Z 2025-09-07T11:04:22.9726436Z W0907 11:04:22.971000 141572 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T11:05:02.0860745Z skipping cudagraphs due to disabling 
cudagraphs due to incompatible op aten.index_put_.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in torch_dynamo_resume_in_forward_and_backward_pass_at_486 2025-09-07T11:05:02.0862036Z pred = mod(*cloned_inputs) 2025-09-07T11:05:02.0862547Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T11:05:02.0863084Z return self.forward_once(x) 2025-09-07T11:05:02.0863903Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 374, in forward_once 2025-09-07T11:05:02.0864457Z x = module(x) 2025-09-07T11:05:02.0864617Z 2025-09-07T11:05:02.0864622Z 2025-09-07T11:05:03.6833355Z pass 2025-09-07T11:05:09.8412469Z accuracy pass_rate=88.24% 2025-09-07T11:05:09.8419416Z calls_captured gmean=0.00x mean=779.588x 2025-09-07T11:05:09.8423753Z unique_graphs gmean=0.00x mean=5.941x 2025-09-07T11:05:09.8428068Z graph_breaks gmean=0.00x mean=8.294x 2025-09-07T11:05:09.8432362Z unique_graph_breaks gmean=0.00x mean=4.765x 2025-09-07T11:05:09.8436717Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T11:05:09.8441020Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T11:05:09.8445671Z cudagraph_skips gmean=0.00x mean=1.059x 2025-09-07T11:05:09.8447037Z compilation_latency mean=73.183 seconds 2025-09-07T11:05:10.6980975Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *cudagraphs_low_precision-true* ]] 2025-09-07T11:05:10.6982524Z + [[ training == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T11:05:10.6982854Z + for target in "${targets[@]}" 2025-09-07T11:05:10.6983154Z + target_flag=('--performance') 2025-09-07T11:05:10.6983429Z + local target_flag 2025-09-07T11:05:10.6983703Z + [[ performance == \p\e\r\f\o\r\m\a\n\c\e ]] 2025-09-07T11:05:10.6984055Z + target_flag+=(--cold-start-latency) 2025-09-07T11:05:10.6985893Z + [[ 
training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *freezing-true* ]] 2025-09-07T11:05:10.6988318Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *default-true* ]] 2025-09-07T11:05:10.6990919Z + python benchmarks/dynamo/torchbench.py --performance --cold-start-latency --training --amp --backend inductor --disable-cudagraphs --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_no_cudagraphs_torchbench_amp_training_cuda_performance.csv 2025-09-07T11:05:11.3001343Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T11:05:11.3002748Z import pynvml # type: ignore[import] 2025-09-07T11:05:15.2406011Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T11:05:15.2407542Z import pynvml # type: ignore[import] 2025-09-07T11:05:17.9497615Z 2025-09-07T11:05:18.9077270Z loading model: 0it [00:00, ?it/s] 2025-09-07T11:05:18.9077661Z loading model: 0it [00:00, ?it/s] 2025-09-07T11:05:18.9077996Z cuda train soft_actor_critic 2025-09-07T11:05:27.0072548Z W0907 11:05:27.006000 148001 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T11:05:28.4957765Z 2025-09-07T11:05:28.5998730Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:12:49.4455619Z 2025-09-07T11:12:52.8750357Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:18:22.2657777Z 2025-09-07T11:18:22.3740827Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:21:15.4205367Z 2025-09-07T11:21:15.5950766Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:25:49.7140241Z 2025-09-07T11:25:50.7066055Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:29:00.7146767Z 2025-09-07T11:29:00.8562393Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:29:22.7879777Z 2025-09-07T11:29:22.8987227Z running benchmark: 0% 0/30 [00:00 2025-09-07T11:30:29.6228758Z W0907 11:30:29.619000 199576 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T11:30:29.6230191Z W0907 11:30:29.619000 199576 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T11:30:29.6231638Z W0907 11:30:29.619000 199576 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T11:30:29.6232552Z W0907 11:30:29.619000 199576 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T11:30:29.6233236Z W0907 11:30:29.619000 199576 
site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T11:31:19.7492451Z W0907 11:31:19.748000 199576 site-packages/torch/_logging/_internal.py:1199] [50/0] Profiler function will be ignored 2025-09-07T11:31:32.9919345Z W0907 11:31:32.991000 199576 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] torch._dynamo hit config.recompile_limit (8) 2025-09-07T11:31:32.9921840Z W0907 11:31:32.991000 199576 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] function: 'roi_align' (/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py:203) 2025-09-07T11:31:32.9924897Z W0907 11:31:32.991000 199576 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] last reason: 29/7: tensor 'boxes' requires_grad mismatch. expected requires_grad=1 2025-09-07T11:31:32.9927207Z W0907 11:31:32.991000 199576 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". 2025-09-07T11:31:32.9929766Z W0907 11:31:32.991000 199576 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html 2025-09-07T11:31:49.2508156Z 2025-09-07T11:31:49.3548949Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:34:02.0319345Z 2025-09-07T11:34:02.1399085Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:34:29.8172154Z 2025-09-07T11:34:29.9190408Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:41:53.0787563Z 2025-09-07T11:41:53.9310341Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:47:30.6111973Z 2025-09-07T11:47:30.7261750Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:50:29.2788263Z 2025-09-07T11:50:29.5475752Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:55:10.1972045Z 2025-09-07T11:55:11.2662183Z running benchmark: 0% 0/30 [00:00 2025-09-07T11:55:11.2666047Z torchbench_main() 
2025-09-07T11:55:11.2667020Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 504, in torchbench_main 2025-09-07T11:55:11.2668194Z main(TorchBenchmarkRunner(), original_dir) 2025-09-07T11:55:11.2669147Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 3636, in main 2025-09-07T11:55:11.2670130Z process_entry(0, runner, original_dir, args) 2025-09-07T11:55:11.2671202Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 3561, in process_entry 2025-09-07T11:55:11.2673810Z result = run(runner, args, original_dir) 2025-09-07T11:55:11.2674757Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 4279, in run 2025-09-07T11:55:11.2678526Z runner.run_one_model( 2025-09-07T11:55:11.2679455Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 2886, in run_one_model 2025-09-07T11:55:11.2681846Z status = self.run_performance_test( 2025-09-07T11:55:11.2683216Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 2796, in run_performance_test 2025-09-07T11:55:11.2685946Z results.append(experiment(model, example_inputs, **experiment_kwargs)) 2025-09-07T11:55:11.2687253Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 1124, in speedup_experiment 2025-09-07T11:55:11.2688379Z timings[rep, 0], expected_output = timed( 2025-09-07T11:55:11.2689751Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 751, in timed 2025-09-07T11:55:11.2690967Z result = model_iter_fn(model, example_inputs, collect_outputs=collect_outputs) 2025-09-07T11:55:11.2692388Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in forward_and_backward_pass 2025-09-07T11:55:11.2693551Z pred = mod(*cloned_inputs) 2025-09-07T11:55:11.2694691Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T11:55:11.2695936Z return self._call_impl(*args, **kwargs) 2025-09-07T11:55:11.2697167Z 
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T11:55:11.2698337Z return forward_call(*args, **kwargs) 2025-09-07T11:55:11.2699477Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/vision_transformer.py", line 849, in forward 2025-09-07T11:55:11.2700647Z x = self.forward_features(x) 2025-09-07T11:55:11.2701847Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/vision_transformer.py", line 830, in forward_features 2025-09-07T11:55:11.2703095Z x = self.blocks(x) 2025-09-07T11:55:11.2704175Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T11:55:11.2705469Z return self._call_impl(*args, **kwargs) 2025-09-07T11:55:11.2706644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T11:55:11.2708076Z return forward_call(*args, **kwargs) 2025-09-07T11:55:11.2709255Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py", line 250, in forward 2025-09-07T11:55:11.2710383Z input = module(input) 2025-09-07T11:55:11.2711482Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T11:55:11.2712729Z return self._call_impl(*args, **kwargs) 2025-09-07T11:55:11.2713844Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T11:55:11.2715023Z return forward_call(*args, **kwargs) 2025-09-07T11:55:11.2716170Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/vision_transformer.py", line 170, in forward 2025-09-07T11:55:11.2717637Z x = x + self.drop_path2(self.ls2(self.mlp(self.norm2(x)))) 2025-09-07T11:55:11.2718954Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in 
_wrapped_call_impl 2025-09-07T11:55:11.2720187Z return self._call_impl(*args, **kwargs) 2025-09-07T11:55:11.2721347Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T11:55:11.2722507Z return forward_call(*args, **kwargs) 2025-09-07T11:55:11.2723718Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/layers/mlp.py", line 45, in forward 2025-09-07T11:55:11.2724748Z x = self.act(x) 2025-09-07T11:55:11.2725832Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T11:55:11.2727113Z return self._call_impl(*args, **kwargs) 2025-09-07T11:55:11.2728298Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T11:55:11.2729475Z return forward_call(*args, **kwargs) 2025-09-07T11:55:11.2730605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py", line 816, in forward 2025-09-07T11:55:11.2731852Z return F.gelu(input, approximate=self.approximate) 2025-09-07T11:55:11.2737169Z torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 98.00 MiB. GPU 0 has a total capacity of 39.38 GiB of which 43.38 MiB is free. Including non-PyTorch memory, this process has 0 bytes memory in use. Of the allocated memory 38.06 GiB is allocated by PyTorch, with 19.06 GiB allocated in private pools (e.g., CUDA Graphs), and 735.78 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. 
See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables) 2025-09-07T11:55:17.7333089Z Run failed with return code: 1 2025-09-07T11:55:17.7333577Z Output: None 2025-09-07T11:55:17.7333879Z Error: None 2025-09-07T11:55:18.3323586Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T11:55:18.3325018Z import pynvml # type: ignore[import] 2025-09-07T11:55:21.3143476Z 2025-09-07T11:55:24.4568681Z loading model: 0it [00:00, ?it/s] 2025-09-07T11:55:24.4569410Z loading model: 0it [00:03, ?it/s] 2025-09-07T11:55:24.4569980Z cuda train timm_vovnet 2025-09-07T11:56:05.9259625Z 2025-09-07T11:56:06.0710501Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:58:01.7038915Z 2025-09-07T11:58:01.8789916Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:58:24.4176780Z 2025-09-07T11:58:24.5237597Z running benchmark: 0% 0/30 [00:00 2025-09-07T11:59:30.0841813Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:59:30.0842230Z 2025-09-07T11:59:30.0842425Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T11:59:30.0843437Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T11:59:30.0844150Z anchors = self.anchor_generator(images, features) 2025-09-07T11:59:30.0844905Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T11:59:30.0846035Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T11:59:30.0846840Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T11:59:30.0847778Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:59:30.0848675Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T11:59:30.0849585Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:59:30.0850004Z 2025-09-07T11:59:30.0850188Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T11:59:30.0850920Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T11:59:30.0851636Z anchors = self.anchor_generator(images, features) 2025-09-07T11:59:30.0852374Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T11:59:30.0853127Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T11:59:30.0853916Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T11:59:30.0854851Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:59:30.0855754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T11:59:30.0856749Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:59:30.0857236Z 2025-09-07T11:59:30.0857421Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T11:59:30.0858166Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T11:59:30.0858886Z anchors = self.anchor_generator(images, features) 2025-09-07T11:59:30.0859636Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T11:59:30.0860380Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T11:59:30.0861168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T11:59:30.0862202Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:59:30.0863110Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T11:59:30.0864018Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:59:30.0864493Z 2025-09-07T11:59:30.0864688Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T11:59:30.0865423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T11:59:30.0866137Z anchors = self.anchor_generator(images, features) 2025-09-07T11:59:30.0866887Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T11:59:30.0867659Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T11:59:30.0868438Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T11:59:30.0869369Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:59:30.0870271Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T11:59:30.0871287Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T11:59:30.0871694Z 2025-09-07T11:59:30.1057403Z cudagraph partition into 2 partitions 2025-09-07T11:59:34.8273858Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break from `Tensor.item()`, consider setting: 2025-09-07T11:59:34.8274987Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] torch._dynamo.config.capture_scalar_outputs = True 2025-09-07T11:59:34.8275910Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] or: 2025-09-07T11:59:34.8276786Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] env TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 2025-09-07T11:59:34.8277825Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] to include these operations in the captured graph. 
2025-09-07T11:59:34.8278712Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T11:59:34.8279511Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break: from user code at: 2025-09-07T11:59:34.8281066Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in torch_dynamo_resume_in_forward_at_117 2025-09-07T11:59:34.8283177Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T11:59:34.8285115Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 763, in forward 2025-09-07T11:59:34.8286599Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T11:59:34.8288006Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 310, in forward 2025-09-07T11:59:34.8289328Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] self.scales, self.map_levels = _setup_scales( 2025-09-07T11:59:34.8294471Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in _setup_scales 2025-09-07T11:59:34.8295931Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat 
in features] 2025-09-07T11:59:34.8297470Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in 2025-09-07T11:59:34.8298907Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T11:59:34.8300324Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T11:59:34.8301709Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T11:59:34.8302621Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T11:59:34.8303297Z W0907 11:59:34.826000 261036 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T11:59:35.1262366Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T11:59:35.1263235Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T11:59:35.1264155Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T11:59:35.1264520Z 2025-09-07T11:59:39.3853101Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T11:59:39.3853987Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T11:59:39.3854928Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T11:59:39.3855300Z 2025-09-07T12:00:24.8053182Z W0907 12:00:24.804000 261036 site-packages/torch/_logging/_internal.py:1199] [50/0] Profiler function will be ignored 2025-09-07T12:00:38.5587836Z W0907 12:00:38.558000 261036 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] torch._dynamo hit config.recompile_limit (8) 2025-09-07T12:00:38.5589158Z W0907 12:00:38.558000 261036 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] function: 'roi_align' (/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py:203) 2025-09-07T12:00:38.5590544Z W0907 12:00:38.558000 261036 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] last reason: 29/7: tensor 'boxes' requires_grad mismatch. expected requires_grad=1 2025-09-07T12:00:38.5592067Z W0907 12:00:38.558000 261036 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". 
2025-09-07T12:00:38.5593410Z W0907 12:00:38.558000 261036 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html 2025-09-07T12:00:54.5382554Z 2025-09-07T12:00:54.6488149Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:03:16.4351487Z 2025-09-07T12:03:16.5886371Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:03:51.5189499Z 2025-09-07T12:03:51.6213169Z running benchmark: 0% 0/30 [00:00 2025-09-07T12:03:59.6805080Z torchbench_main() 2025-09-07T12:03:59.6805619Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 504, in torchbench_main 2025-09-07T12:03:59.6806659Z main(TorchBenchmarkRunner(), original_dir) 2025-09-07T12:03:59.6807203Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 3636, in main 2025-09-07T12:03:59.6813408Z process_entry(0, runner, original_dir, args) 2025-09-07T12:03:59.6813995Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 3561, in process_entry 2025-09-07T12:03:59.6820208Z result = run(runner, args, original_dir) 2025-09-07T12:03:59.6820718Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 4251, in run 2025-09-07T12:03:59.6828820Z assert marked, f"nothing in example_inputs had a dim with {batch_size}" 2025-09-07T12:03:59.6829404Z AssertionError: nothing in example_inputs had a dim with 32 2025-09-07T12:04:00.6401274Z Run failed with return code: 1 2025-09-07T12:04:00.6401651Z Output: None 2025-09-07T12:04:00.6401868Z Error: None 2025-09-07T12:04:01.2526155Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T12:04:01.2528640Z import pynvml # type: ignore[import] 2025-09-07T12:04:04.1753080Z 2025-09-07T12:04:05.7969344Z loading model: 0it [00:00, ?it/s] 2025-09-07T12:04:05.7969732Z loading model: 0it [00:01, ?it/s] 2025-09-07T12:04:05.7970072Z cuda train squeezenet1_1 2025-09-07T12:04:32.5263604Z 2025-09-07T12:04:32.7761930Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:10:19.3185770Z 2025-09-07T12:10:20.1743138Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:16:03.9055636Z 2025-09-07T12:16:05.4535941Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:19:13.0349622Z 2025-09-07T12:19:14.0747333Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:24:08.3374678Z 2025-09-07T12:24:09.4358753Z running benchmark: 0% 0/30 [00:00 2025-09-07T12:24:09.4360977Z torchbench_main() 2025-09-07T12:24:09.4361515Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 504, in torchbench_main 2025-09-07T12:24:09.4362473Z main(TorchBenchmarkRunner(), original_dir) 2025-09-07T12:24:09.4363150Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 3636, in main 2025-09-07T12:24:09.4367178Z process_entry(0, runner, original_dir, args) 2025-09-07T12:24:09.4367824Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 3561, in process_entry 2025-09-07T12:24:09.4370914Z result = run(runner, args, original_dir) 2025-09-07T12:24:09.4371428Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 4279, in run 2025-09-07T12:24:09.4375755Z runner.run_one_model( 2025-09-07T12:24:09.4376267Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 2886, in run_one_model 2025-09-07T12:24:09.4379352Z status = self.run_performance_test( 2025-09-07T12:24:09.4379953Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 2796, in run_performance_test 2025-09-07T12:24:09.4382730Z results.append(experiment(model, example_inputs, **experiment_kwargs)) 
2025-09-07T12:24:09.4383463Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 1124, in speedup_experiment 2025-09-07T12:24:09.4384090Z timings[rep, 0], expected_output = timed( 2025-09-07T12:24:09.4384620Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 751, in timed 2025-09-07T12:24:09.4385414Z result = model_iter_fn(model, example_inputs, collect_outputs=collect_outputs) 2025-09-07T12:24:09.4386197Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in forward_and_backward_pass 2025-09-07T12:24:09.4387176Z pred = mod(*cloned_inputs) 2025-09-07T12:24:09.4387842Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T12:24:09.4388771Z return self._call_impl(*args, **kwargs) 2025-09-07T12:24:09.4389415Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T12:24:09.4391061Z return forward_call(*args, **kwargs) 2025-09-07T12:24:09.4391692Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/vision_transformer.py", line 849, in forward 2025-09-07T12:24:09.4392588Z x = self.forward_features(x) 2025-09-07T12:24:09.4393257Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/vision_transformer.py", line 830, in forward_features 2025-09-07T12:24:09.4393979Z x = self.blocks(x) 2025-09-07T12:24:09.4394598Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T12:24:09.4396444Z return self._call_impl(*args, **kwargs) 2025-09-07T12:24:09.4397087Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T12:24:09.4398743Z return forward_call(*args, **kwargs) 2025-09-07T12:24:09.4399372Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py", line 250, in 
forward 2025-09-07T12:24:09.4400004Z input = module(input) 2025-09-07T12:24:09.4400742Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T12:24:09.4401770Z return self._call_impl(*args, **kwargs) 2025-09-07T12:24:09.4402411Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T12:24:09.4404450Z return forward_call(*args, **kwargs) 2025-09-07T12:24:09.4405112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/vision_transformer.py", line 170, in forward 2025-09-07T12:24:09.4405818Z x = x + self.drop_path2(self.ls2(self.mlp(self.norm2(x)))) 2025-09-07T12:24:09.4406557Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T12:24:09.4407328Z return self._call_impl(*args, **kwargs) 2025-09-07T12:24:09.4408075Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T12:24:09.4409736Z return forward_call(*args, **kwargs) 2025-09-07T12:24:09.4410288Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/layers/mlp.py", line 44, in forward 2025-09-07T12:24:09.4410848Z x = self.fc1(x) 2025-09-07T12:24:09.4411454Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T12:24:09.4412519Z return self._call_impl(*args, **kwargs) 2025-09-07T12:24:09.4413161Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T12:24:09.4414876Z return forward_call(*args, **kwargs) 2025-09-07T12:24:09.4415491Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 134, in forward 2025-09-07T12:24:09.4416158Z return F.linear(input, self.weight, self.bias) 2025-09-07T12:24:09.4419041Z 
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 100.00 MiB. GPU 0 has a total capacity of 39.38 GiB of which 51.38 MiB is free. Including non-PyTorch memory, this process has 0 bytes memory in use. Of the allocated memory 38.10 GiB is allocated by PyTorch, with 19.06 GiB allocated in private pools (e.g., CUDA Graphs), and 690.33 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables) 2025-09-07T12:24:16.3761196Z Run failed with return code: 1 2025-09-07T12:24:16.3761554Z Output: None 2025-09-07T12:24:16.3761767Z Error: None 2025-09-07T12:24:16.9477111Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T12:24:16.9478567Z import pynvml # type: ignore[import] 2025-09-07T12:24:19.6783671Z 2025-09-07T12:24:22.8643106Z loading model: 0it [00:00, ?it/s] 2025-09-07T12:24:22.8643511Z loading model: 0it [00:03, ?it/s] 2025-09-07T12:24:22.8644909Z cuda train timm_vovnet 2025-09-07T12:25:08.9088459Z 2025-09-07T12:25:09.4938526Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:27:12.0288158Z 2025-09-07T12:27:12.2197058Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:27:37.4769009Z 2025-09-07T12:27:37.5874269Z running benchmark: 0% 0/30 [00:00 2025-09-07T12:28:46.7599664Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T12:28:46.7600100Z 2025-09-07T12:28:46.7600350Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T12:28:46.7601174Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T12:28:46.7601934Z anchors = self.anchor_generator(images, features) 2025-09-07T12:28:46.7602769Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T12:28:46.7604261Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T12:28:46.7605157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T12:28:46.7606176Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T12:28:46.7607127Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T12:28:46.7608113Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T12:28:46.7608582Z 2025-09-07T12:28:46.7608790Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T12:28:46.7609604Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T12:28:46.7610393Z anchors = self.anchor_generator(images, features) 2025-09-07T12:28:46.7611186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T12:28:46.7612027Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T12:28:46.7612897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T12:28:46.7613917Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T12:28:46.7615007Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T12:28:46.7615951Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T12:28:46.7616424Z 2025-09-07T12:28:46.7616630Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T12:28:46.7617526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T12:28:46.7618329Z anchors = self.anchor_generator(images, features) 2025-09-07T12:28:46.7619160Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T12:28:46.7619956Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T12:28:46.7620851Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T12:28:46.7621968Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T12:28:46.7622959Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T12:28:46.7623943Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T12:28:46.7624373Z 2025-09-07T12:28:46.7624584Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T12:28:46.7625403Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T12:28:46.7626201Z anchors = self.anchor_generator(images, features) 2025-09-07T12:28:46.7627038Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T12:28:46.7627883Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T12:28:46.7628717Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T12:28:46.7629735Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T12:28:46.7630727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T12:28:46.7631835Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T12:28:46.7632273Z 2025-09-07T12:28:46.7814406Z cudagraph partition into 2 partitions 2025-09-07T12:28:50.5134186Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break from `Tensor.item()`, consider setting: 2025-09-07T12:28:50.5135409Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] torch._dynamo.config.capture_scalar_outputs = True 2025-09-07T12:28:50.5136412Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] or: 2025-09-07T12:28:50.5137327Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] env TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 2025-09-07T12:28:50.5138550Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] to include these operations in the captured graph. 
2025-09-07T12:28:50.5139518Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T12:28:50.5140408Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break: from user code at: 2025-09-07T12:28:50.5142007Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in torch_dynamo_resume_in_forward_at_117 2025-09-07T12:28:50.5143882Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T12:28:50.5145855Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 763, in forward 2025-09-07T12:28:50.5147426Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T12:28:50.5148912Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 310, in forward 2025-09-07T12:28:50.5150317Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] self.scales, self.map_levels = _setup_scales( 2025-09-07T12:28:50.5151853Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in _setup_scales 2025-09-07T12:28:50.5153372Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat 
in features] 2025-09-07T12:28:50.5154884Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in 2025-09-07T12:28:50.5156358Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T12:28:50.5157864Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T12:28:50.5159298Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T12:28:50.5160297Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T12:28:50.5161061Z W0907 12:28:50.512000 320561 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T12:28:50.8134933Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T12:28:50.8135932Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T12:28:50.8136928Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T12:28:50.8137310Z 2025-09-07T12:28:55.5198584Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T12:28:55.5199541Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T12:28:55.5200594Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T12:28:55.5201025Z 2025-09-07T12:29:38.9310580Z W0907 12:29:38.929000 320561 site-packages/torch/_logging/_internal.py:1199] [50/0] Profiler function will be ignored 2025-09-07T12:29:52.2071288Z W0907 12:29:52.206000 320561 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] torch._dynamo hit config.recompile_limit (8) 2025-09-07T12:29:52.2072704Z W0907 12:29:52.206000 320561 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] function: 'roi_align' (/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py:203) 2025-09-07T12:29:52.2074183Z W0907 12:29:52.206000 320561 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] last reason: 29/7: tensor 'boxes' requires_grad mismatch. expected requires_grad=1 2025-09-07T12:29:52.2075810Z W0907 12:29:52.206000 320561 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". 
2025-09-07T12:29:52.2077188Z W0907 12:29:52.206000 320561 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html 2025-09-07T12:30:07.4301843Z 2025-09-07T12:30:07.5408230Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:32:25.9322750Z 2025-09-07T12:32:26.6473049Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:33:07.9459512Z 2025-09-07T12:33:08.0463678Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:47:22.0863735Z 2025-09-07T12:47:22.5360352Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:58:51.9815092Z 2025-09-07T12:58:52.1043417Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:03:40.0667031Z 2025-09-07T13:03:40.2432707Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:11:48.1861749Z 2025-09-07T13:11:49.1663825Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:17:07.0306599Z 2025-09-07T13:17:07.1688312Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:17:37.6037041Z 2025-09-07T13:17:37.7065532Z running benchmark: 0% 0/30 [00:00 2025-09-07T13:19:13.7099195Z W0907 13:19:13.707000 385400 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T13:19:13.7100623Z W0907 13:19:13.707000 385400 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T13:19:13.7102000Z W0907 13:19:13.707000 385400 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T13:19:13.7102915Z W0907 13:19:13.707000 385400 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T13:19:13.7103596Z W0907 13:19:13.707000 385400 
site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T13:20:59.5645075Z W0907 13:20:59.563000 385400 site-packages/torch/_logging/_internal.py:1199] [50/0] Profiler function will be ignored 2025-09-07T13:21:20.5290703Z W0907 13:21:20.528000 385400 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] torch._dynamo hit config.recompile_limit (8) 2025-09-07T13:21:20.5292912Z W0907 13:21:20.528000 385400 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] function: 'roi_align' (/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py:203) 2025-09-07T13:21:20.5294365Z W0907 13:21:20.528000 385400 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] last reason: 29/7: tensor 'boxes' requires_grad mismatch. expected requires_grad=1 2025-09-07T13:21:20.5295569Z W0907 13:21:20.528000 385400 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". 2025-09-07T13:21:20.5296903Z W0907 13:21:20.528000 385400 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html 2025-09-07T13:21:51.6990076Z 2025-09-07T13:21:51.8590336Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:25:22.7672614Z 2025-09-07T13:25:22.8892646Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:26:02.1941433Z 2025-09-07T13:26:02.2968107Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:37:52.4730068Z 2025-09-07T13:37:53.3030256Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:44:58.9389659Z 2025-09-07T13:44:59.0530264Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:48:53.2681280Z 2025-09-07T13:48:53.5366783Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:55:01.3433529Z 2025-09-07T13:55:02.4010277Z running benchmark: 0% 0/30 [00:00 2025-09-07T13:55:02.4012567Z torchbench_main() 
2025-09-07T13:55:02.4013108Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 504, in torchbench_main 2025-09-07T13:55:02.4013745Z main(TorchBenchmarkRunner(), original_dir) 2025-09-07T13:55:02.4014609Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 3636, in main 2025-09-07T13:55:02.4018161Z process_entry(0, runner, original_dir, args) 2025-09-07T13:55:02.4018822Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 3561, in process_entry 2025-09-07T13:55:02.4022474Z result = run(runner, args, original_dir) 2025-09-07T13:55:02.4023053Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 4279, in run 2025-09-07T13:55:02.4027281Z runner.run_one_model( 2025-09-07T13:55:02.4027848Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 2886, in run_one_model 2025-09-07T13:55:02.4030715Z status = self.run_performance_test( 2025-09-07T13:55:02.4031363Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 2796, in run_performance_test 2025-09-07T13:55:02.4034111Z results.append(experiment(model, example_inputs, **experiment_kwargs)) 2025-09-07T13:55:02.4034977Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 1124, in speedup_experiment 2025-09-07T13:55:02.4035599Z timings[rep, 0], expected_output = timed( 2025-09-07T13:55:02.4036129Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 751, in timed 2025-09-07T13:55:02.4037131Z result = model_iter_fn(model, example_inputs, collect_outputs=collect_outputs) 2025-09-07T13:55:02.4037972Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 491, in forward_and_backward_pass 2025-09-07T13:55:02.4038650Z pred = mod(*cloned_inputs) 2025-09-07T13:55:02.4039305Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T13:55:02.4040034Z return self._call_impl(*args, **kwargs) 2025-09-07T13:55:02.4040675Z 
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T13:55:02.4042498Z return forward_call(*args, **kwargs) 2025-09-07T13:55:02.4043292Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/vision_transformer.py", line 849, in forward 2025-09-07T13:55:02.4044578Z x = self.forward_features(x) 2025-09-07T13:55:02.4045254Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/vision_transformer.py", line 830, in forward_features 2025-09-07T13:55:02.4046143Z x = self.blocks(x) 2025-09-07T13:55:02.4046760Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T13:55:02.4048562Z return self._call_impl(*args, **kwargs) 2025-09-07T13:55:02.4049424Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T13:55:02.4051038Z return forward_call(*args, **kwargs) 2025-09-07T13:55:02.4051672Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py", line 250, in forward 2025-09-07T13:55:02.4052307Z input = module(input) 2025-09-07T13:55:02.4052941Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T13:55:02.4053894Z return self._call_impl(*args, **kwargs) 2025-09-07T13:55:02.4054526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T13:55:02.4056277Z return forward_call(*args, **kwargs) 2025-09-07T13:55:02.4057042Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/vision_transformer.py", line 170, in forward 2025-09-07T13:55:02.4057856Z x = x + self.drop_path2(self.ls2(self.mlp(self.norm2(x)))) 2025-09-07T13:55:02.4058599Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in 
_wrapped_call_impl 2025-09-07T13:55:02.4059503Z return self._call_impl(*args, **kwargs) 2025-09-07T13:55:02.4060218Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T13:55:02.4061956Z return forward_call(*args, **kwargs) 2025-09-07T13:55:02.4062522Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/layers/mlp.py", line 45, in forward 2025-09-07T13:55:02.4063085Z x = self.act(x) 2025-09-07T13:55:02.4063669Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T13:55:02.4064720Z return self._call_impl(*args, **kwargs) 2025-09-07T13:55:02.4065363Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T13:55:02.4067113Z return forward_call(*args, **kwargs) 2025-09-07T13:55:02.4067746Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py", line 816, in forward 2025-09-07T13:55:02.4068429Z return F.gelu(input, approximate=self.approximate) 2025-09-07T13:55:02.4071169Z torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 98.00 MiB. GPU 0 has a total capacity of 39.38 GiB of which 15.38 MiB is free. Including non-PyTorch memory, this process has 0 bytes memory in use. Of the allocated memory 38.08 GiB is allocated by PyTorch, with 19.08 GiB allocated in private pools (e.g., CUDA Graphs), and 738.32 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. 
See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables) 2025-09-07T13:55:09.8367889Z Run failed with return code: 1 2025-09-07T13:55:09.8368262Z Output: None 2025-09-07T13:55:09.8368480Z Error: None 2025-09-07T13:55:10.4435955Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T13:55:10.4437443Z import pynvml # type: ignore[import] 2025-09-07T13:55:13.1340279Z 2025-09-07T13:55:16.0999187Z loading model: 0it [00:00, ?it/s] 2025-09-07T13:55:16.0999569Z loading model: 0it [00:02, ?it/s] 2025-09-07T13:55:16.0999902Z cuda train timm_vovnet 2025-09-07T13:55:36.4412622Z Autotune Choices Stats: 2025-09-07T13:55:36.4415006Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_5", "best_kernel_desc": "ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8", "best_time": 0.08294399827718735, "best_triton_pos": 0} 2025-09-07T13:55:36.4565637Z AUTOTUNE convolution(32x3x224x224, 64x3x3x3) 2025-09-07T13:55:36.4566330Z strides: [150528, 50176, 224, 1], [27, 9, 3, 1] 2025-09-07T13:55:36.4566948Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:36.4568628Z triton_convolution2d_5 0.0829 ms 100.0% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:36.4571319Z triton_convolution2d_1 0.0891 ms 93.1% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 
2025-09-07T13:55:36.4574318Z triton_convolution2d_0 0.0973 ms 85.3% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:36.4576979Z triton_convolution2d_3 0.1034 ms 80.2% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:36.4579862Z triton_convolution2d_4 0.1147 ms 72.3% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:36.4581490Z convolution 0.1464 ms 56.6% 2025-09-07T13:55:36.4583062Z triton_convolution2d_2 0.1802 ms 46.0% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T13:55:36.4585183Z SingleProcess AUTOTUNE benchmarking takes 0.1885 seconds and 0.0003 seconds precompiling for 7 choices 2025-09-07T13:55:36.7552076Z Autotune Choices Stats: 2025-09-07T13:55:36.7555524Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.3481599986553192, "best_triton_pos": 1, "best_triton_time": 0.4249599874019623, "best_triton_kernel": "triton_convolution2d_7", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T13:55:36.7706091Z AUTOTUNE convolution(32x64x112x112, 64x64x3x3) 2025-09-07T13:55:36.7706804Z strides: [802816, 12544, 112, 1], [576, 9, 3, 1] 2025-09-07T13:55:36.7707425Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:36.7707997Z convolution 0.3482 ms 100.0% 2025-09-07T13:55:36.7709648Z 
triton_convolution2d_7 0.4250 ms 81.9% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:36.7712502Z triton_convolution2d_12 0.4250 ms 81.9% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:36.7715324Z triton_convolution2d_9 0.4465 ms 78.0% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:36.7718120Z triton_convolution2d_10 0.5356 ms 65.0% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:36.7720911Z triton_convolution2d_11 0.5519 ms 63.1% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:36.7724230Z triton_convolution2d_6 0.5888 ms 59.1% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:36.7727034Z triton_convolution2d_8 1.3435 ms 25.9% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T13:55:36.7729246Z SingleProcess AUTOTUNE benchmarking takes 0.3135 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T13:55:37.0718846Z Autotune Choices Stats: 2025-09-07T13:55:37.0721993Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 
0.23756800591945648, "best_triton_pos": 1, "best_triton_time": 0.2754560112953186, "best_triton_kernel": "triton_convolution2d_14", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T13:55:37.0873238Z AUTOTUNE convolution(32x64x112x112, 128x64x3x3) 2025-09-07T13:55:37.0873976Z strides: [802816, 12544, 112, 1], [576, 9, 3, 1] 2025-09-07T13:55:37.0874614Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:37.0875163Z convolution 0.2376 ms 100.0% 2025-09-07T13:55:37.0877069Z triton_convolution2d_14 0.2755 ms 86.2% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:37.0879901Z triton_convolution2d_19 0.2755 ms 86.2% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:37.0882728Z triton_convolution2d_13 0.2888 ms 82.3% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:37.0885638Z triton_convolution2d_16 0.3205 ms 74.1% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:37.0888595Z triton_convolution2d_17 0.3379 ms 70.3% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:37.0891283Z triton_convolution2d_18 0.3840 ms 61.9% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, 
KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:37.0893956Z triton_convolution2d_15 1.6005 ms 14.8% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T13:55:37.0896078Z SingleProcess AUTOTUNE benchmarking takes 0.3155 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T13:55:37.3922756Z Autotune Choices Stats: 2025-09-07T13:55:37.3925960Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.23552000522613525, "best_triton_pos": 1, "best_triton_time": 0.45977601408958435, "best_triton_kernel": "triton_convolution2d_21", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T13:55:37.4078956Z AUTOTUNE convolution(32x128x56x56, 128x128x3x3) 2025-09-07T13:55:37.4079691Z strides: [401408, 3136, 56, 1], [1152, 9, 3, 1] 2025-09-07T13:55:37.4080348Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:37.4080902Z convolution 0.2355 ms 100.0% 2025-09-07T13:55:37.4082986Z triton_convolution2d_21 0.4598 ms 51.2% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:37.4085737Z triton_convolution2d_26 0.5335 ms 44.1% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:37.4088574Z triton_convolution2d_20 0.5663 ms 41.6% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, 
num_warps=4 2025-09-07T13:55:37.4091404Z triton_convolution2d_23 0.6830 ms 34.5% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:37.4094439Z triton_convolution2d_25 0.8049 ms 29.3% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:37.4097366Z triton_convolution2d_24 0.9236 ms 25.5% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:37.4100101Z triton_convolution2d_22 1.4377 ms 16.4% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T13:55:37.4102337Z SingleProcess AUTOTUNE benchmarking takes 0.3193 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T13:55:37.7640128Z Autotune Choices Stats: 2025-09-07T13:55:37.7643518Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.19763199985027313, "best_triton_pos": 1, "best_triton_time": 0.3696640133857727, "best_triton_kernel": "triton_convolution2d_60", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T13:55:37.7805967Z AUTOTUNE convolution(32x768x56x56, 256x768x1x1) 2025-09-07T13:55:37.7806698Z strides: [2408448, 3136, 56, 1], [768, 1, 1, 1] 2025-09-07T13:55:37.7807595Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:37.7808155Z convolution 0.1976 ms 100.0% 2025-09-07T13:55:37.7809715Z triton_convolution2d_60 0.3697 ms 53.5% 
ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:37.7812418Z triton_convolution2d_58 0.3717 ms 53.2% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:37.7815104Z triton_convolution2d_55 0.4588 ms 43.1% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:37.7817885Z triton_convolution2d_61 0.5202 ms 38.0% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:37.7820598Z triton_convolution2d_59 0.5550 ms 35.6% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:37.7823762Z triton_convolution2d_56 0.5663 ms 34.9% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:37.7827189Z triton_convolution2d_57 0.9544 ms 20.7% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T13:55:37.7828896Z conv1x1_via_mm 1.4520 ms 13.6% 2025-09-07T13:55:37.7829947Z SingleProcess AUTOTUNE benchmarking takes 0.3524 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T13:55:38.0684198Z Autotune Choices Stats: 2025-09-07T13:55:38.0687343Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": 
"convolution", "best_time": 0.14847999811172485, "best_triton_pos": 1, "best_triton_time": 0.38809600472450256, "best_triton_kernel": "triton_convolution2d_63", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T13:55:38.0840297Z AUTOTUNE convolution(32x256x28x28, 160x256x3x3) 2025-09-07T13:55:38.0841025Z strides: [200704, 784, 28, 1], [2304, 9, 3, 1] 2025-09-07T13:55:38.0841701Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:38.0842333Z convolution 0.1485 ms 100.0% 2025-09-07T13:55:38.0844781Z triton_convolution2d_63 0.3881 ms 38.3% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:38.0848133Z triton_convolution2d_68 0.4168 ms 35.6% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:38.0850937Z triton_convolution2d_66 0.4321 ms 34.4% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:38.0853737Z triton_convolution2d_62 0.5396 ms 27.5% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:38.0856574Z triton_convolution2d_65 0.5806 ms 25.6% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:38.0859574Z triton_convolution2d_64 0.7803 ms 19.0% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, 
BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T13:55:38.0862417Z triton_convolution2d_67 0.8366 ms 17.7% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:38.0864636Z SingleProcess AUTOTUNE benchmarking takes 0.3022 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T13:55:38.3714911Z Autotune Choices Stats: 2025-09-07T13:55:38.3718094Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.10342399775981903, "best_triton_pos": 1, "best_triton_time": 0.24780799448490143, "best_triton_kernel": "triton_convolution2d_70", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T13:55:38.3875355Z AUTOTUNE convolution(32x160x28x28, 160x160x3x3) 2025-09-07T13:55:38.3876087Z strides: [125440, 784, 28, 1], [1440, 9, 3, 1] 2025-09-07T13:55:38.3876714Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:38.3877286Z convolution 0.1034 ms 100.0% 2025-09-07T13:55:38.3878950Z triton_convolution2d_70 0.2478 ms 41.7% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:38.3882099Z triton_convolution2d_75 0.2478 ms 41.7% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:38.3885219Z triton_convolution2d_73 0.2857 ms 36.2% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, 
STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:38.3888038Z triton_convolution2d_69 0.3400 ms 30.4% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:38.3890868Z triton_convolution2d_72 0.3461 ms 29.9% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:38.3893871Z triton_convolution2d_71 0.4895 ms 21.1% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T13:55:38.3896817Z triton_convolution2d_74 0.5714 ms 18.1% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:38.3899036Z SingleProcess AUTOTUNE benchmarking takes 0.3022 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T13:55:38.7267616Z Autotune Choices Stats: 2025-09-07T13:55:38.7269878Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.14028799533843994, "best_triton_pos": 1, "best_triton_time": 0.26419198513031006, "best_triton_kernel": "triton_convolution2d_100", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T13:55:38.7426243Z AUTOTUNE convolution(32x1056x28x28, 512x1056x1x1) 2025-09-07T13:55:38.7426974Z strides: [827904, 784, 28, 1], [1056, 1, 1, 1] 2025-09-07T13:55:38.7427597Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:38.7428176Z convolution 0.1403 ms 100.0% 2025-09-07T13:55:38.7430006Z 
triton_convolution2d_100 0.2642 ms 53.1% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:38.7432492Z triton_convolution2d_102 0.2642 ms 53.1% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:38.7434971Z triton_convolution2d_103 0.2826 ms 49.6% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:38.7437422Z triton_convolution2d_101 0.2990 ms 46.9% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:38.7439871Z triton_convolution2d_97 0.3144 ms 44.6% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:38.7442301Z triton_convolution2d_98 0.3195 ms 43.9% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:38.7444169Z conv1x1_via_mm 0.7055 ms 19.9% 2025-09-07T13:55:38.7445807Z triton_convolution2d_99 0.7936 ms 17.7% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T13:55:38.7447709Z SingleProcess AUTOTUNE benchmarking takes 0.3390 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T13:55:39.0261061Z Autotune Choices Stats: 2025-09-07T13:55:39.0263837Z {"num_choices": 8, 
"num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.1013759970664978, "best_triton_pos": 1, "best_triton_time": 0.2764799892902374, "best_triton_kernel": "triton_convolution2d_110", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T13:55:39.0417805Z AUTOTUNE convolution(32x512x14x14, 192x512x3x3) 2025-09-07T13:55:39.0418496Z strides: [100352, 196, 14, 1], [4608, 9, 3, 1] 2025-09-07T13:55:39.0419144Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:39.0419732Z convolution 0.1014 ms 100.0% 2025-09-07T13:55:39.0421260Z triton_convolution2d_110 0.2765 ms 36.7% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:39.0424057Z triton_convolution2d_107 0.3369 ms 30.1% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:39.0426581Z triton_convolution2d_108 0.3502 ms 28.9% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:39.0429034Z triton_convolution2d_105 0.3512 ms 28.9% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:39.0431474Z triton_convolution2d_109 0.4946 ms 20.5% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:39.0434180Z triton_convolution2d_106 0.5110 ms 19.8% 
ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T13:55:39.0436667Z triton_convolution2d_104 0.5407 ms 18.7% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:39.0438624Z SingleProcess AUTOTUNE benchmarking takes 0.2978 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T13:55:39.2622443Z Autotune Choices Stats: 2025-09-07T13:55:39.2624927Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.053247999399900436, "best_triton_pos": 1, "best_triton_time": 0.11059200018644333, "best_triton_kernel": "triton_convolution2d_117", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T13:55:39.2784956Z AUTOTUNE convolution(32x192x14x14, 192x192x3x3) 2025-09-07T13:55:39.2785542Z strides: [37632, 196, 14, 1], [1728, 9, 3, 1] 2025-09-07T13:55:39.2786063Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:39.2786541Z convolution 0.0532 ms 100.0% 2025-09-07T13:55:39.2787891Z triton_convolution2d_117 0.1106 ms 48.1% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:39.2790400Z triton_convolution2d_115 0.1352 ms 39.4% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:39.2792622Z triton_convolution2d_114 0.1372 ms 38.8% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, 
KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:39.2794837Z triton_convolution2d_112 0.1393 ms 38.2% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:39.2797028Z triton_convolution2d_113 0.1905 ms 28.0% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T13:55:39.2799428Z triton_convolution2d_116 0.1956 ms 27.2% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:39.2801764Z triton_convolution2d_111 0.2161 ms 24.6% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:39.2803817Z SingleProcess AUTOTUNE benchmarking takes 0.2354 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T13:55:39.5799154Z Autotune Choices Stats: 2025-09-07T13:55:39.5802306Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.11776000261306763, "best_triton_pos": 1, "best_triton_time": 0.15360000729560852, "best_triton_kernel": "triton_convolution2d_142", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T13:55:39.5961734Z AUTOTUNE convolution(32x1472x14x14, 768x1472x1x1) 2025-09-07T13:55:39.5962459Z strides: [288512, 196, 14, 1], [1472, 1, 1, 1] 2025-09-07T13:55:39.5963202Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:39.5963774Z convolution 
0.1178 ms 100.0% 2025-09-07T13:55:39.5965568Z triton_convolution2d_142 0.1536 ms 76.7% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:39.5967842Z triton_convolution2d_143 0.1710 ms 68.9% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:39.5970451Z triton_convolution2d_144 0.1915 ms 61.5% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:39.5973094Z triton_convolution2d_140 0.1997 ms 59.0% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:39.5975700Z triton_convolution2d_139 0.2007 ms 58.7% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:39.5978394Z triton_convolution2d_145 0.2140 ms 55.0% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:39.5979993Z conv1x1_via_mm 0.4198 ms 28.0% 2025-09-07T13:55:39.5981571Z triton_convolution2d_141 0.4700 ms 25.1% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T13:55:39.5983858Z SingleProcess AUTOTUNE benchmarking takes 0.3015 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T13:55:39.8856700Z Autotune Choices Stats: 
2025-09-07T13:55:39.8859367Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.13414399325847626, "best_triton_pos": 1, "best_triton_time": 0.4126720130443573, "best_triton_kernel": "triton_convolution2d_152", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T13:55:39.9016008Z AUTOTUNE convolution(32x768x14x14, 192x768x3x3) 2025-09-07T13:55:39.9016760Z strides: [150528, 196, 14, 1], [6912, 9, 3, 1] 2025-09-07T13:55:39.9017804Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:39.9018418Z convolution 0.1341 ms 100.0% 2025-09-07T13:55:39.9020091Z triton_convolution2d_152 0.4127 ms 32.5% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:39.9023026Z triton_convolution2d_150 0.5069 ms 26.5% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:39.9025760Z triton_convolution2d_147 0.5243 ms 25.6% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:39.9028533Z triton_convolution2d_149 0.5407 ms 24.8% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:39.9031252Z triton_convolution2d_148 0.7363 ms 18.2% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 
2025-09-07T13:55:39.9033987Z triton_convolution2d_151 0.7649 ms 17.5% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:39.9036991Z triton_convolution2d_146 0.8407 ms 16.0% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:39.9039160Z SingleProcess AUTOTUNE benchmarking takes 0.3040 seconds and 0.0003 seconds precompiling for 8 choices 2025-09-07T13:55:40.2314784Z Autotune Choices Stats: 2025-09-07T13:55:40.2317333Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.13209599256515503, "best_triton_pos": 1, "best_triton_time": 0.17715199291706085, "best_triton_kernel": "triton_convolution2d_184", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T13:55:40.2475471Z AUTOTUNE convolution(32x1728x14x14, 768x1728x1x1) 2025-09-07T13:55:40.2476256Z strides: [338688, 196, 14, 1], [1728, 1, 1, 1] 2025-09-07T13:55:40.2476903Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:40.2477501Z convolution 0.1321 ms 100.0% 2025-09-07T13:55:40.2479136Z triton_convolution2d_184 0.1772 ms 74.6% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:40.2481899Z triton_convolution2d_185 0.2038 ms 64.8% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:40.2485295Z triton_convolution2d_186 0.2284 ms 57.8% ALLOW_TF32=True, 
BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:40.2488045Z triton_convolution2d_182 0.2314 ms 57.1% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:40.2490772Z triton_convolution2d_181 0.2396 ms 55.1% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:40.2493717Z triton_convolution2d_187 0.2580 ms 51.2% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:40.2495397Z conv1x1_via_mm 0.4803 ms 27.5% 2025-09-07T13:55:40.2497036Z triton_convolution2d_183 0.5632 ms 23.5% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T13:55:40.2499451Z SingleProcess AUTOTUNE benchmarking takes 0.3247 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T13:55:40.5321771Z Autotune Choices Stats: 2025-09-07T13:55:40.5325061Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.06963200122117996, "best_triton_pos": 1, "best_triton_time": 0.28467199206352234, "best_triton_kernel": "triton_convolution2d_192", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T13:55:40.5481226Z AUTOTUNE convolution(32x768x7x7, 224x768x3x3) 2025-09-07T13:55:40.5481876Z strides: [37632, 49, 7, 1], [6912, 9, 3, 1] 
2025-09-07T13:55:40.5482456Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:40.5483168Z convolution 0.0696 ms 100.0% 2025-09-07T13:55:40.5485076Z triton_convolution2d_192 0.2847 ms 24.5% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:40.5487870Z triton_convolution2d_194 0.3953 ms 17.6% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:40.5490671Z triton_convolution2d_191 0.4444 ms 15.7% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:40.5493593Z triton_convolution2d_189 0.4710 ms 14.8% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:40.5496466Z triton_convolution2d_193 0.6083 ms 11.4% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:40.5499450Z triton_convolution2d_190 0.8253 ms 8.4% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T13:55:40.5502326Z triton_convolution2d_188 0.8530 ms 8.2% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:40.5504698Z SingleProcess AUTOTUNE benchmarking takes 0.2992 seconds and 0.0002 seconds precompiling for 8 choices 
2025-09-07T13:55:40.7727272Z Autotune Choices Stats: 2025-09-07T13:55:40.7730256Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03686400130391121, "best_triton_pos": 1, "best_triton_time": 0.08499199897050858, "best_triton_kernel": "triton_convolution2d_199", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T13:55:40.7882797Z AUTOTUNE convolution(32x224x7x7, 224x224x3x3) 2025-09-07T13:55:40.7883647Z strides: [10976, 49, 7, 1], [2016, 9, 3, 1] 2025-09-07T13:55:40.7884244Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:40.7884789Z convolution 0.0369 ms 100.0% 2025-09-07T13:55:40.7886595Z triton_convolution2d_199 0.0850 ms 43.4% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:40.7889396Z triton_convolution2d_201 0.1229 ms 30.0% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:40.7892362Z triton_convolution2d_198 0.1341 ms 27.5% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:40.7895070Z triton_convolution2d_196 0.1526 ms 24.2% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:40.7897864Z triton_convolution2d_200 0.1823 ms 20.2% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, 
num_stages=2, num_warps=8 2025-09-07T13:55:40.7900526Z triton_convolution2d_197 0.2447 ms 15.1% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T13:55:40.7903514Z triton_convolution2d_195 0.2632 ms 14.0% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:40.7905720Z SingleProcess AUTOTUNE benchmarking takes 0.2388 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T13:55:41.0346037Z Autotune Choices Stats: 2025-09-07T13:55:41.0349075Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.05427199974656105, "best_triton_pos": 1, "best_triton_time": 0.08499199897050858, "best_triton_kernel": "triton_convolution2d_227", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T13:55:41.0505080Z AUTOTUNE convolution(32x1888x7x7, 1024x1888x1x1) 2025-09-07T13:55:41.0505719Z strides: [92512, 49, 7, 1], [1888, 1, 1, 1] 2025-09-07T13:55:41.0506270Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:41.0506796Z convolution 0.0543 ms 100.0% 2025-09-07T13:55:41.0508363Z triton_convolution2d_227 0.0850 ms 63.9% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:41.0511031Z triton_convolution2d_228 0.0870 ms 62.4% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:41.0513963Z triton_convolution2d_226 0.0942 ms 
57.6% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:41.0516648Z triton_convolution2d_223 0.1219 ms 44.5% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:41.0519449Z triton_convolution2d_229 0.1413 ms 38.4% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:41.0522212Z triton_convolution2d_224 0.1700 ms 31.9% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:41.0524393Z conv1x1_via_mm 0.2724 ms 19.9% 2025-09-07T13:55:41.0526040Z triton_convolution2d_225 0.4393 ms 12.4% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T13:55:41.0528252Z SingleProcess AUTOTUNE benchmarking takes 0.2459 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T13:55:41.3362465Z Autotune Choices Stats: 2025-09-07T13:55:41.3365802Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.08499199897050858, "best_triton_pos": 1, "best_triton_time": 0.38707199692726135, "best_triton_kernel": "triton_convolution2d_234", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T13:55:41.3519552Z AUTOTUNE convolution(32x1024x7x7, 224x1024x3x3) 2025-09-07T13:55:41.3520201Z strides: [50176, 
49, 7, 1], [9216, 9, 3, 1] 2025-09-07T13:55:41.3520770Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:41.3521303Z convolution 0.0850 ms 100.0% 2025-09-07T13:55:41.3523046Z triton_convolution2d_234 0.3871 ms 22.0% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:41.3526014Z triton_convolution2d_236 0.5243 ms 16.2% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:41.3528620Z triton_convolution2d_233 0.5734 ms 14.8% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:41.3531275Z triton_convolution2d_231 0.6410 ms 13.3% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:41.3533932Z triton_convolution2d_235 0.7772 ms 10.9% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T13:55:41.3536708Z triton_convolution2d_232 1.0957 ms 7.8% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T13:55:41.3539468Z triton_convolution2d_230 1.1561 ms 7.4% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T13:55:41.3541528Z SingleProcess AUTOTUNE benchmarking takes 0.3002 seconds and 0.0002 seconds 
precompiling for 8 choices 2025-09-07T13:55:41.6118167Z Autotune Choices Stats: 2025-09-07T13:55:41.6121130Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.05939200147986412, "best_triton_pos": 1, "best_triton_time": 0.0942080020904541, "best_triton_kernel": "triton_convolution2d_269", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T13:55:41.6272548Z AUTOTUNE convolution(32x2144x7x7, 1024x2144x1x1) 2025-09-07T13:55:41.6273241Z strides: [105056, 49, 7, 1], [2144, 1, 1, 1] 2025-09-07T13:55:41.6273794Z dtypes: torch.float16, torch.float16 2025-09-07T13:55:41.6274343Z convolution 0.0594 ms 100.0% 2025-09-07T13:55:41.6275919Z triton_convolution2d_269 0.0942 ms 63.0% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:41.6278986Z triton_convolution2d_270 0.0983 ms 60.4% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:41.6282047Z triton_convolution2d_268 0.1055 ms 56.3% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:41.6285018Z triton_convolution2d_265 0.1382 ms 43.0% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:41.6287794Z triton_convolution2d_271 0.1587 ms 37.4% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, 
STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T13:55:41.6290551Z triton_convolution2d_266 0.1894 ms 31.4% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T13:55:41.6292230Z conv1x1_via_mm 0.3052 ms 19.5% 2025-09-07T13:55:41.6294073Z triton_convolution2d_267 0.4966 ms 12.0% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T13:55:41.6296262Z SingleProcess AUTOTUNE benchmarking takes 0.2541 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T13:55:41.8896419Z Autotune Choices Stats: 2025-09-07T13:55:41.8898580Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_276", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T13:55:41.9057442Z AUTOTUNE addmm(32x1000, 32x1024, 1024x1000) 2025-09-07T13:55:41.9058064Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T13:55:41.9058658Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T13:55:41.9060123Z triton_mm_276 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T13:55:41.9061424Z bias_addmm 0.0123 ms 91.7% 2025-09-07T13:55:41.9062704Z triton_mm_280 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T13:55:41.9064704Z triton_mm_275 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T13:55:41.9066832Z triton_mm_284 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T13:55:41.9069377Z triton_mm_273 0.0154 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T13:55:41.9071581Z triton_mm_274 0.0154 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T13:55:41.9073712Z triton_mm_288 0.0154 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=32, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T13:55:41.9075887Z triton_mm_279 0.0164 ms 68.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T13:55:41.9078291Z triton_mm_283 0.0164 ms 68.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=32, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T13:55:41.9080175Z SingleProcess AUTOTUNE benchmarking takes 0.2771 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T13:55:59.3801178Z Autotune Choices Stats: 2025-09-07T13:55:59.3802684Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_313", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T13:55:59.3960925Z AUTOTUNE mm(1000x32, 32x1024) 2025-09-07T13:55:59.3961225Z strides: [1, 1000], [1024, 1] 2025-09-07T13:55:59.3961519Z dtypes: torch.float16, torch.float16 
2025-09-07T13:55:59.3962297Z triton_mm_313 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T13:55:59.3963577Z triton_mm_306 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T13:55:59.3964750Z triton_mm_307 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T13:55:59.3966096Z triton_mm_308 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T13:55:59.3967254Z triton_mm_309 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T13:55:59.3968419Z triton_mm_310 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T13:55:59.3969590Z triton_mm_311 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T13:55:59.3970762Z triton_mm_312 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T13:55:59.3971928Z triton_mm_314 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T13:55:59.3973101Z triton_mm_315 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T13:55:59.3974126Z SingleProcess AUTOTUNE benchmarking takes 0.2323 seconds and 0.0004 seconds precompiling for 18 choices 2025-09-07T13:56:00.1765612Z Autotune Choices Stats: 2025-09-07T13:56:00.1766849Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_293", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T13:56:00.1922629Z AUTOTUNE mm(32x1000, 1000x1024) 2025-09-07T13:56:00.1923063Z strides: [1000, 1], [1024, 1] 2025-09-07T13:56:00.1923371Z dtypes: torch.float16, torch.float16 2025-09-07T13:56:00.1924132Z triton_mm_293 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T13:56:00.1925339Z triton_mm_297 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T13:56:00.1926388Z mm 0.0133 ms 84.6% 2025-09-07T13:56:00.1927085Z triton_mm_301 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T13:56:00.1928381Z triton_mm_292 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T13:56:00.1929616Z triton_mm_305 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=32, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T13:56:00.1930795Z triton_mm_291 0.0154 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, 
BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T13:56:00.1931971Z triton_mm_296 0.0154 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T13:56:00.1933157Z triton_mm_290 0.0164 ms 68.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T13:56:00.1934344Z triton_mm_300 0.0164 ms 68.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=32, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T13:56:00.1935464Z SingleProcess AUTOTUNE benchmarking takes 0.2590 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T13:56:16.4486666Z 2025-09-07T13:56:16.5928717Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:58:50.5520890Z 2025-09-07T13:58:50.7285064Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T13:59:13.3481922Z 2025-09-07T13:59:13.4536107Z running benchmark: 0% 0/30 [00:00 2025-09-07T14:01:17.2090402Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:01:17.2090807Z 2025-09-07T14:01:17.2091110Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T14:01:17.2091854Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T14:01:17.2092573Z anchors = self.anchor_generator(images, features) 2025-09-07T14:01:17.2093312Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T14:01:17.2094072Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T14:01:17.2094866Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T14:01:17.2095797Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:01:17.2096700Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in <listcomp> 2025-09-07T14:01:17.2097665Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:01:17.2098080Z 2025-09-07T14:01:17.2098261Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T14:01:17.2098997Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T14:01:17.2099707Z anchors = self.anchor_generator(images, features) 2025-09-07T14:01:17.2100457Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T14:01:17.2101272Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T14:01:17.2102067Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T14:01:17.2103000Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:01:17.2103904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in <listcomp> 2025-09-07T14:01:17.2104806Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:01:17.2105208Z 2025-09-07T14:01:17.2105386Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T14:01:17.2106117Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T14:01:17.2106884Z anchors = self.anchor_generator(images, features) 2025-09-07T14:01:17.2107634Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T14:01:17.2108390Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T14:01:17.2109168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T14:01:17.2110143Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:01:17.2111045Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in <listcomp> 2025-09-07T14:01:17.2111948Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:01:17.2112352Z 2025-09-07T14:01:17.2112546Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T14:01:17.2113273Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T14:01:17.2113986Z anchors = self.anchor_generator(images, features) 2025-09-07T14:01:17.2114735Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T14:01:17.2115492Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T14:01:17.2116342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T14:01:17.2117261Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:01:17.2118165Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in <listcomp> 2025-09-07T14:01:17.2119062Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:01:17.2119466Z 2025-09-07T14:01:17.2284301Z cudagraph partition into 2 partitions 2025-09-07T14:01:22.7664385Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break from `Tensor.item()`, consider setting: 2025-09-07T14:01:22.7665508Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] torch._dynamo.config.capture_scalar_outputs = True 2025-09-07T14:01:22.7666442Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] or: 2025-09-07T14:01:22.7667316Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] env TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 2025-09-07T14:01:22.7668369Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] to include these operations in the captured graph. 
2025-09-07T14:01:22.7669241Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T14:01:22.7670059Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] Graph break: from user code at: 2025-09-07T14:01:22.7671937Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in torch_dynamo_resume_in_forward_at_117 2025-09-07T14:01:22.7673743Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T14:01:22.7675337Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 763, in forward 2025-09-07T14:01:22.7676815Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T14:01:22.7678322Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 310, in forward 2025-09-07T14:01:22.7679641Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] self.scales, self.map_levels = _setup_scales( 2025-09-07T14:01:22.7681079Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in _setup_scales 2025-09-07T14:01:22.7682494Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat 
in features] 2025-09-07T14:01:22.7684162Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in <listcomp> 2025-09-07T14:01:22.7685585Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T14:01:22.7687011Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T14:01:22.7688504Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T14:01:22.7689427Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T14:01:22.7690110Z W0907 14:01:22.765000 542757 site-packages/torch/_dynamo/variables/tensor.py:1048] [17/0] 2025-09-07T14:01:23.0613239Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T14:01:23.0614123Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T14:01:23.0615028Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T14:01:23.0615399Z 2025-09-07T14:01:27.7454752Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T14:01:27.7455690Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T14:01:27.7456609Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T14:01:27.7456968Z 2025-09-07T14:01:35.3077252Z Autotune Choices Stats: 2025-09-07T14:01:35.3080612Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.14028799533843994, "best_triton_pos": 1, "best_triton_time": 0.21094399690628052, "best_triton_kernel": "triton_mm_1048", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T14:01:35.3322430Z AUTOTUNE mm(1000x12544, 12544x1024) 2025-09-07T14:01:35.3323497Z strides: [12544, 1], [1, 12544] 2025-09-07T14:01:35.3324066Z dtypes: torch.float16, torch.float16 2025-09-07T14:01:35.3324638Z mm 0.1403 ms 100.0% 2025-09-07T14:01:35.3326057Z triton_mm_1048 0.2109 ms 66.5% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:35.3328455Z triton_mm_1043 0.2161 ms 64.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:35.3330826Z triton_mm_1040 0.2253 ms 62.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:01:35.3333529Z triton_mm_1049 0.2284 ms 61.4% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:01:35.3335827Z triton_mm_1039 0.2294 ms 61.2% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, 
BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:01:35.3338400Z triton_mm_1044 0.2509 ms 55.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:35.3340894Z triton_mm_1047 0.2570 ms 54.6% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:35.3343267Z triton_mm_1041 0.2642 ms 53.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:35.3345620Z triton_mm_1042 0.2662 ms 52.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:01:35.3347638Z SingleProcess AUTOTUNE benchmarking takes 0.7704 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T14:01:36.0086077Z Autotune Choices Stats: 2025-09-07T14:01:36.0087957Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.023552000522613525, "best_triton_pos": 1, "best_triton_time": 0.023552000522613525, "best_triton_kernel": "triton_mm_1067", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T14:01:36.0271879Z AUTOTUNE mm(1000x1024, 1024x1024) 2025-09-07T14:01:36.0272625Z strides: [1024, 1], [1, 1024] 2025-09-07T14:01:36.0273324Z dtypes: torch.float16, torch.float16 2025-09-07T14:01:36.0273999Z mm 0.0236 ms 100.0% 2025-09-07T14:01:36.0275478Z triton_mm_1067 0.0236 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:01:36.0277952Z 
triton_mm_1061 0.0246 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:36.0280428Z triton_mm_1066 0.0256 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:36.0282665Z triton_mm_1057 0.0266 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:01:36.0285275Z triton_mm_1062 0.0276 ms 85.2% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:36.0287813Z triton_mm_1059 0.0287 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:36.0290077Z triton_mm_1065 0.0287 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:36.0292373Z triton_mm_1058 0.0297 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:01:36.0294726Z triton_mm_1060 0.0297 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:01:36.0296844Z SingleProcess AUTOTUNE benchmarking takes 0.3270 seconds and 0.0004 seconds precompiling for 19 choices 2025-09-07T14:01:37.3027818Z Autotune Choices Stats: 2025-09-07T14:01:37.3029052Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_1072", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, 
BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T14:01:37.3228929Z AUTOTUNE addmm(1000x91, 1000x1024, 1024x91) 2025-09-07T14:01:37.3229278Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T14:01:37.3229640Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T14:01:37.3230460Z triton_mm_1072 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:01:37.3231664Z triton_mm_1069 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:01:37.3232867Z triton_mm_1076 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:01:37.3234043Z triton_mm_1070 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:01:37.3235321Z triton_mm_1071 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:01:37.3236498Z triton_mm_1075 0.0174 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:01:37.3237675Z triton_mm_1078 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:01:37.3238861Z triton_mm_1079 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:37.3240045Z triton_mm_1081 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:01:37.3241240Z triton_mm_1077 0.0276 ms 48.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:37.3242265Z SingleProcess AUTOTUNE benchmarking takes 0.3061 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T14:01:37.6081689Z Autotune Choices Stats: 2025-09-07T14:01:37.6083021Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_1094", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T14:01:37.6276152Z AUTOTUNE addmm(1000x364, 1000x1024, 1024x364) 2025-09-07T14:01:37.6276520Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T14:01:37.6276882Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T14:01:37.6277719Z triton_mm_1094 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:01:37.6278898Z triton_mm_1093 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:01:37.6280077Z triton_mm_1090 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:01:37.6281409Z triton_mm_1097 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:37.6282595Z triton_mm_1087 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:01:37.6283958Z triton_mm_1088 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:01:37.6285136Z triton_mm_1089 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:01:37.6286315Z triton_mm_1096 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:01:37.6287492Z triton_mm_1099 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:01:37.6288688Z triton_mm_1098 0.0256 ms 60.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:37.6289796Z SingleProcess AUTOTUNE benchmarking takes 0.3041 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T14:01:58.2306757Z Autotune Choices Stats: 2025-09-07T14:01:58.2470525Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01945599913597107, "best_triton_pos": 1, "best_triton_time": 0.06758400052785873, "best_triton_kernel": "triton_convolution2d_1108", "best_triton_kernel_desc": "ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T14:01:58.2534970Z AUTOTUNE 
convolution(4x256x14x14, 256x256x3x3) 2025-09-07T14:01:58.2535638Z strides: [50176, 1, 3584, 256], [2304, 1, 768, 256] 2025-09-07T14:01:58.2766474Z dtypes: torch.float16, torch.float16 2025-09-07T14:01:58.2766883Z convolution 0.0195 ms 100.0% 2025-09-07T14:01:58.2767809Z triton_convolution2d_1108 0.0676 ms 28.8% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:01:58.2769305Z triton_convolution2d_1107 0.0983 ms 19.8% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:01:58.2770812Z triton_convolution2d_1109 0.1055 ms 18.4% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:01:58.2772604Z triton_convolution2d_1110 0.1065 ms 18.3% ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:01:58.2774106Z triton_convolution2d_1105 0.1270 ms 15.3% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:01:58.2775604Z triton_convolution2d_1104 0.1341 ms 14.5% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:01:58.2777104Z triton_convolution2d_1106 0.2140 ms 9.1% ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 
2025-09-07T14:01:58.2778508Z SingleProcess AUTOTUNE benchmarking takes 0.2277 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:01:59.6944702Z Autotune Choices Stats: 2025-09-07T14:01:59.6946255Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_1135", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.01228800043463707, "best_triton_pos": 0} 2025-09-07T14:01:59.7155375Z AUTOTUNE addmm(3136x91, 3136x256, 256x91) 2025-09-07T14:01:59.7155753Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T14:01:59.7156115Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T14:01:59.7156926Z triton_mm_1135 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:01:59.7158160Z triton_mm_1139 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:01:59.7159356Z triton_mm_1133 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:01:59.7160800Z triton_mm_1140 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:01:59.7162002Z triton_mm_1142 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:01:59.7163444Z triton_mm_1145 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 
2025-09-07T14:01:59.7164636Z triton_mm_1134 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:01:59.7165823Z triton_mm_1136 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:01:59.7167005Z triton_mm_1143 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:01:59.7168175Z triton_mm_1138 0.0154 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:01:59.7169202Z SingleProcess AUTOTUNE benchmarking takes 0.2921 seconds and 1.1524 seconds precompiling for 20 choices 2025-09-07T14:02:20.5980563Z Autotune Choices Stats: 2025-09-07T14:02:20.5982477Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.11468800157308578, "best_triton_pos": 1, "best_triton_time": 0.1454080045223236, "best_triton_kernel": "triton_mm_1292", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T14:02:20.6185499Z AUTOTUNE mm(1024x1000, 1000x12544) 2025-09-07T14:02:20.6185834Z strides: [1, 1024], [12544, 1] 2025-09-07T14:02:20.6186142Z dtypes: torch.float16, torch.float16 2025-09-07T14:02:20.6186455Z mm 0.1147 ms 100.0% 2025-09-07T14:02:20.6187177Z triton_mm_1292 0.1454 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:20.6188545Z triton_mm_1291 0.1546 ms 74.2% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:20.6189737Z triton_mm_1288 0.1587 ms 72.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:20.6191011Z triton_mm_1287 0.1659 ms 69.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:20.6192199Z triton_mm_1285 0.1669 ms 68.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:20.6193398Z triton_mm_1293 0.1812 ms 63.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:02:20.6194609Z triton_mm_1289 0.1997 ms 57.4% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:02:20.6195803Z triton_mm_1286 0.2028 ms 56.6% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:02:20.6197114Z triton_mm_1290 0.2038 ms 56.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T14:02:20.6198142Z SingleProcess AUTOTUNE benchmarking takes 0.6402 seconds and 4.3714 seconds precompiling for 19 choices 2025-09-07T14:02:21.1374133Z Autotune Choices Stats: 2025-09-07T14:02:21.1375639Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.021503999829292297, "best_triton_pos": 1, "best_triton_time": 0.025599999353289604, "best_triton_kernel": "triton_mm_1251", 
"best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T14:02:21.1568513Z AUTOTUNE mm(1024x1000, 1000x1024) 2025-09-07T14:02:21.1568872Z strides: [1, 1024], [1024, 1] 2025-09-07T14:02:21.1569193Z dtypes: torch.float16, torch.float16 2025-09-07T14:02:21.1569492Z mm 0.0215 ms 100.0% 2025-09-07T14:02:21.1570223Z triton_mm_1251 0.0256 ms 84.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:21.1571431Z triton_mm_1256 0.0256 ms 84.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:21.1572630Z triton_mm_1247 0.0266 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:02:21.1573833Z triton_mm_1257 0.0266 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:02:21.1575284Z triton_mm_1252 0.0276 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:21.1576471Z triton_mm_1248 0.0287 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:02:21.1577733Z triton_mm_1249 0.0287 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:21.1578940Z triton_mm_1255 0.0287 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:21.1580247Z triton_mm_1250 0.0317 ms 67.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:02:21.1581283Z SingleProcess AUTOTUNE benchmarking takes 0.3016 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:02:21.4324169Z Autotune Choices Stats: 2025-09-07T14:02:21.4325588Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_1210", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T14:02:21.4521853Z AUTOTUNE mm(1000x91, 91x1024) 2025-09-07T14:02:21.4522158Z strides: [91, 1], [1024, 1] 2025-09-07T14:02:21.4522455Z dtypes: torch.float16, torch.float16 2025-09-07T14:02:21.4523360Z triton_mm_1210 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:02:21.4524587Z triton_mm_1212 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:02:21.4525975Z triton_mm_1213 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:21.4527201Z triton_mm_1214 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:02:21.4528415Z triton_mm_1216 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:21.4529637Z triton_mm_1218 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T14:02:21.4530862Z triton_mm_1219 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:21.4532068Z triton_mm_1206 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:02:21.4533255Z triton_mm_1209 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:02:21.4534437Z triton_mm_1211 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:02:21.4535453Z SingleProcess AUTOTUNE benchmarking takes 0.2712 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:02:22.4907780Z Autotune Choices Stats: 2025-09-07T14:02:22.4909353Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.021503999829292297, "best_triton_pos": 1, "best_triton_time": 0.023552000522613525, "best_triton_kernel": "triton_mm_1233", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T14:02:22.5103437Z AUTOTUNE mm(1000x1024, 1024x1024) 2025-09-07T14:02:22.5103759Z strides: [1024, 1], [1024, 1] 2025-09-07T14:02:22.5104062Z dtypes: torch.float16, torch.float16 2025-09-07T14:02:22.5104359Z mm 0.0215 ms 100.0% 2025-09-07T14:02:22.5105068Z 
triton_mm_1233 0.0236 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:22.5106524Z triton_mm_1239 0.0246 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:02:22.5107723Z triton_mm_1231 0.0256 ms 84.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:22.5108996Z triton_mm_1229 0.0266 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:02:22.5110185Z triton_mm_1238 0.0266 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:22.5111385Z triton_mm_1237 0.0276 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:22.5112565Z triton_mm_1232 0.0287 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:02:22.5113757Z triton_mm_1235 0.0287 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:02:22.5115029Z triton_mm_1230 0.0297 ms 72.4% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:02:22.5116059Z SingleProcess AUTOTUNE benchmarking takes 0.3056 seconds and 0.0003 seconds precompiling for 19 choices 
2025-09-07T14:02:23.0117107Z Autotune Choices Stats: 2025-09-07T14:02:23.0118334Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_1157", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.01740800030529499, "best_triton_pos": 0} 2025-09-07T14:02:23.0316592Z AUTOTUNE mm(364x1000, 1000x1024) 2025-09-07T14:02:23.0316908Z strides: [1, 364], [1024, 1] 2025-09-07T14:02:23.0317202Z dtypes: torch.float16, torch.float16 2025-09-07T14:02:23.0317968Z triton_mm_1157 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:02:23.0319167Z triton_mm_1152 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:02:23.0320366Z triton_mm_1158 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:02:23.0321566Z triton_mm_1161 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:23.0322569Z mm 0.0215 ms 81.0% 2025-09-07T14:02:23.0323470Z triton_mm_1160 0.0215 ms 81.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:02:23.0324657Z triton_mm_1153 0.0236 ms 73.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:02:23.0325860Z triton_mm_1163 0.0236 ms 73.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:02:23.0327163Z triton_mm_1159 0.0246 ms 70.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:23.0328359Z triton_mm_1151 0.0266 ms 65.4% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:02:23.0329397Z SingleProcess AUTOTUNE benchmarking takes 0.2884 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:02:23.5360440Z Autotune Choices Stats: 2025-09-07T14:02:23.5362227Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.020479999482631683, "best_triton_pos": 1, "best_triton_time": 0.02457600086927414, "best_triton_kernel": "triton_mm_1169", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4"} 2025-09-07T14:02:23.5561633Z AUTOTUNE mm(91x1000, 1000x1024) 2025-09-07T14:02:23.5561946Z strides: [1, 91], [1024, 1] 2025-09-07T14:02:23.5562245Z dtypes: torch.float16, torch.float16 2025-09-07T14:02:23.5562551Z mm 0.0205 ms 100.0% 2025-09-07T14:02:23.5563362Z triton_mm_1169 0.0246 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:02:23.5564572Z triton_mm_1170 0.0276 ms 74.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:02:23.5565916Z triton_mm_1172 0.0276 ms 74.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 
2025-09-07T14:02:23.5567117Z triton_mm_1175 0.0297 ms 69.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:02:23.5568312Z triton_mm_1179 0.0328 ms 62.5% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:23.5569496Z triton_mm_1171 0.0358 ms 57.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:02:23.5570684Z triton_mm_1174 0.0379 ms 54.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:02:23.5571872Z triton_mm_1176 0.0399 ms 51.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:02:23.5573064Z triton_mm_1177 0.0420 ms 48.8% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:23.5574097Z SingleProcess AUTOTUNE benchmarking takes 0.3124 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:02:24.3856754Z Autotune Choices Stats: 2025-09-07T14:02:24.3858069Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_1195", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T14:02:24.4057346Z AUTOTUNE mm(1000x364, 364x1024) 2025-09-07T14:02:24.4057717Z strides: [364, 1], [1024, 1] 2025-09-07T14:02:24.4058019Z dtypes: torch.float16, torch.float16 2025-09-07T14:02:24.4058791Z 
triton_mm_1195 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:24.4060014Z triton_mm_1197 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:24.4066423Z triton_mm_1201 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:24.4067789Z triton_mm_1196 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:02:24.4068997Z triton_mm_1192 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:02:24.4070200Z triton_mm_1198 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:24.4071394Z triton_mm_1193 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:02:24.4072576Z triton_mm_1199 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:02:24.4073780Z triton_mm_1203 0.0195 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:02:24.4074637Z mm 0.0205 ms 80.0% 2025-09-07T14:02:24.4075182Z SingleProcess AUTOTUNE benchmarking takes 0.2720 seconds 
and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:02:25.0121326Z Autotune Choices Stats: 2025-09-07T14:02:25.0122793Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.12185599654912949, "best_triton_pos": 1, "best_triton_time": 0.1382399946451187, "best_triton_kernel": "triton_mm_1274", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T14:02:25.0320661Z AUTOTUNE mm(1000x1024, 1024x12544) 2025-09-07T14:02:25.0320997Z strides: [1024, 1], [12544, 1] 2025-09-07T14:02:25.0321302Z dtypes: torch.float16, torch.float16 2025-09-07T14:02:25.0321601Z mm 0.1219 ms 100.0% 2025-09-07T14:02:25.0322336Z triton_mm_1274 0.1382 ms 88.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:25.0323656Z triton_mm_1273 0.1485 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:25.0324848Z triton_mm_1267 0.1505 ms 81.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:25.0326243Z triton_mm_1269 0.1597 ms 76.3% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:25.0327435Z triton_mm_1270 0.1669 ms 73.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:02:25.0328615Z triton_mm_1268 0.1741 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=8 2025-09-07T14:02:25.0329815Z triton_mm_1275 0.1741 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:02:25.0331013Z triton_mm_1272 0.2017 ms 60.4% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T14:02:25.0332299Z triton_mm_1271 0.2028 ms 60.1% ACC_TYPE='tl.float32', ALLOW_TF32=True, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:02:25.0333328Z SingleProcess AUTOTUNE benchmarking takes 0.6255 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T14:02:34.5644967Z W0907 14:02:34.563000 542757 site-packages/torch/_logging/_internal.py:1199] [50/0] Profiler function will be ignored 2025-09-07T14:02:48.6086329Z W0907 14:02:48.607000 542757 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] torch._dynamo hit config.recompile_limit (8) 2025-09-07T14:02:48.6087637Z W0907 14:02:48.607000 542757 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] function: 'roi_align' (/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/roi_align.py:203) 2025-09-07T14:02:48.6089043Z W0907 14:02:48.607000 542757 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] last reason: 29/7: tensor 'boxes' requires_grad mismatch. expected requires_grad=1 2025-09-07T14:02:48.6090236Z W0907 14:02:48.607000 542757 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] To log all recompilation reasons, use TORCH_LOGS="recompiles". 
2025-09-07T14:02:48.6091952Z W0907 14:02:48.607000 542757 site-packages/torch/_dynamo/convert_frame.py:1358] [29/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html 2025-09-07T14:03:04.6582506Z 2025-09-07T14:03:04.7665172Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T14:05:47.7943688Z 2025-09-07T14:05:47.9441075Z running benchmark: 0% 0/30 [00:00 2025-09-07T14:13:10.4676643Z W0907 14:13:10.464000 577382 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T14:13:10.4678069Z W0907 14:13:10.464000 577382 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T14:13:10.4679446Z W0907 14:13:10.464000 577382 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T14:13:10.4680461Z W0907 14:13:10.464000 577382 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:13:10.4681141Z W0907 14:13:10.464000 577382 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:13:19.2478666Z pass 2025-09-07T14:13:22.8242191Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:13:22.8243958Z import pynvml # type: ignore[import] 2025-09-07T14:13:25.6828112Z 2025-09-07T14:13:27.9573980Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:13:27.9574387Z loading model: 0it [00:02, ?it/s] 2025-09-07T14:13:27.9667559Z cuda eval yolov3 2025-09-07T14:13:54.9310577Z pass 2025-09-07T14:13:57.7919107Z accuracy pass_rate=88.89% 2025-09-07T14:13:57.7924207Z calls_captured gmean=0.00x mean=407.778x 2025-09-07T14:13:57.7928737Z unique_graphs gmean=0.00x mean=2.444x 2025-09-07T14:13:57.7932870Z graph_breaks gmean=0.00x mean=1.889x 2025-09-07T14:13:57.7937586Z unique_graph_breaks gmean=0.00x mean=0.389x 2025-09-07T14:13:57.7942077Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T14:13:57.7946140Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T14:13:57.7950328Z cudagraph_skips gmean=0.00x mean=0.000x 2025-09-07T14:13:57.7951973Z compilation_latency mean=15.102 seconds 2025-09-07T14:13:58.6749768Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *cudagraphs-true* ]] 2025-09-07T14:13:58.6752395Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --inference --bfloat16 --backend inductor --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_accuracy.csv 2025-09-07T14:13:59.3041065Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:13:59.3042552Z import pynvml # type: ignore[import] 2025-09-07T14:14:03.1286954Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:14:03.1288359Z import pynvml # type: ignore[import] 2025-09-07T14:14:06.2928852Z 2025-09-07T14:14:07.2605573Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:14:07.2606669Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:14:07.2607687Z cuda eval soft_actor_critic 2025-09-07T14:14:11.6247879Z pass 2025-09-07T14:14:14.7368882Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:14:14.7370298Z import pynvml # type: ignore[import] 2025-09-07T14:14:17.5340065Z 2025-09-07T14:14:19.3485703Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:14:19.3486080Z loading model: 0it [00:01, ?it/s] 2025-09-07T14:14:19.3551746Z cuda eval speech_transformer 2025-09-07T14:14:28.3233049Z W0907 14:14:28.322000 578990 site-packages/torch/_inductor/utils.py:2298] [7/0_1] DeviceCopy in input program 2025-09-07T14:14:29.5815107Z cudagraph partition due to non gpu ops 2025-09-07T14:14:29.5815536Z cudagraph partition due to non gpu ops 2025-09-07T14:14:29.5815903Z cudagraph partition due to non gpu ops 2025-09-07T14:14:29.5816242Z cudagraph partition due to non gpu ops 2025-09-07T14:14:29.5816580Z cudagraph partition due to non gpu ops 2025-09-07T14:14:29.5816905Z cudagraph partition due to non gpu ops 2025-09-07T14:14:29.5817240Z cudagraph partition due to non gpu ops 2025-09-07T14:14:29.5817633Z cudagraph partition due to DeviceCopy ops 2025-09-07T14:14:29.6127019Z cudagraph partition into 2 partitions 2025-09-07T14:14:44.8250387Z W0907 14:14:44.824000 578990 site-packages/torch/_inductor/utils.py:2298] [13/0_1] DeviceCopy in input program 2025-09-07T14:14:46.4886685Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4887088Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4887440Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4887785Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4888143Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4888486Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4888839Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4889175Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4889516Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4889844Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4890183Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4890518Z cudagraph partition due to non gpu ops 
2025-09-07T14:14:46.4890863Z cudagraph partition due to non gpu ops 2025-09-07T14:14:46.4891542Z cudagraph partition due to DeviceCopy ops 2025-09-07T14:14:46.5417310Z cudagraph partition into 2 partitions 2025-09-07T14:14:48.4422218Z pass 2025-09-07T14:14:51.6437271Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:14:51.6438724Z import pynvml # type: ignore[import] 2025-09-07T14:14:54.4447170Z 2025-09-07T14:14:55.4433388Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:14:55.4433783Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:14:55.4450933Z cuda eval squeezenet1_1 2025-09-07T14:15:02.4137132Z pass 2025-09-07T14:15:05.2470978Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:15:05.2473884Z import pynvml # type: ignore[import] 2025-09-07T14:15:08.0230633Z 2025-09-07T14:15:09.8443647Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:15:09.8443924Z 2025-09-07T14:15:10.2817430Z Loading pipeline components...: 0% 0/6 [00:00 2025-09-07T14:21:18.3361256Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:21:18.3361661Z 2025-09-07T14:21:18.3362097Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T14:21:18.3363103Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T14:21:18.3363809Z anchors = self.anchor_generator(images, features) 2025-09-07T14:21:18.3364566Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T14:21:18.3365328Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T14:21:18.3366121Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T14:21:18.3367051Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:21:18.3367939Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T14:21:18.3368836Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:21:18.3369255Z 2025-09-07T14:21:18.3369436Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T14:21:18.3370168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T14:21:18.3370970Z anchors = self.anchor_generator(images, features) 2025-09-07T14:21:18.3371714Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T14:21:18.3372481Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T14:21:18.3373273Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T14:21:18.3374196Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:21:18.3375105Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T14:21:18.3375991Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:21:18.3376405Z 2025-09-07T14:21:18.3376585Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T14:21:18.3377407Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T14:21:18.3378220Z anchors = self.anchor_generator(images, features) 2025-09-07T14:21:18.3378967Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T14:21:18.3379800Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T14:21:18.3380592Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T14:21:18.3381524Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:21:18.3382430Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T14:21:18.3383335Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:21:18.3383737Z 2025-09-07T14:21:18.3383920Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T14:21:18.3384657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T14:21:18.3385366Z anchors = self.anchor_generator(images, features) 2025-09-07T14:21:18.3386111Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T14:21:18.3386863Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T14:21:18.3387656Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T14:21:18.3388653Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:21:18.3389558Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T14:21:18.3390462Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T14:21:18.3390869Z 2025-09-07T14:21:18.3731095Z cudagraph partition into 2 partitions 2025-09-07T14:21:19.9581023Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] Graph break from `Tensor.item()`, consider setting: 2025-09-07T14:21:19.9582160Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] torch._dynamo.config.capture_scalar_outputs = True 2025-09-07T14:21:19.9583063Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] or: 2025-09-07T14:21:19.9583956Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] env TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 2025-09-07T14:21:19.9584991Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] to include these operations in the captured graph. 
2025-09-07T14:21:19.9586192Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:21:19.9587015Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] Graph break: from user code at: 2025-09-07T14:21:19.9588552Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in torch_dynamo_resume_in_forward_at_117 2025-09-07T14:21:19.9590412Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T14:21:19.9592003Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 763, in forward 2025-09-07T14:21:19.9593577Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T14:21:19.9594978Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 310, in forward 2025-09-07T14:21:19.9596363Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] self.scales, self.map_levels = _setup_scales( 2025-09-07T14:21:19.9597710Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in _setup_scales 2025-09-07T14:21:19.9599147Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat 
in features] 2025-09-07T14:21:19.9600578Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in 2025-09-07T14:21:19.9602008Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T14:21:19.9603713Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T14:21:19.9605211Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T14:21:19.9606114Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:21:33.0548873Z W0907 14:21:19.957000 581361 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:21:33.0549420Z pass 2025-09-07T14:21:36.9479270Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:21:36.9480692Z import pynvml # type: ignore[import] 2025-09-07T14:21:39.6965303Z 2025-09-07T14:21:41.8014748Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:21:41.8015139Z loading model: 0it [00:02, ?it/s] 2025-09-07T14:21:41.8109519Z cuda eval yolov3 2025-09-07T14:21:57.3316767Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T14:21:57.3317458Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 482, in forward_pass 2025-09-07T14:21:57.3318024Z return mod(*inputs) 2025-09-07T14:21:57.3318782Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T14:21:57.3319330Z return self.forward_once(x) 2025-09-07T14:21:57.3319854Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T14:21:57.3320406Z yolo_out.append(module(x, out)) 2025-09-07T14:21:57.3320919Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T14:21:57.3321493Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T14:21:57.3322077Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T14:21:57.3322673Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T14:21:57.3323163Z 2025-09-07T14:21:57.3323326Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T14:21:57.3323955Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 482, in forward_pass 2025-09-07T14:21:57.3324523Z return mod(*inputs) 2025-09-07T14:21:57.3324989Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T14:21:57.3325642Z return self.forward_once(x) 2025-09-07T14:21:57.3326167Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T14:21:57.3326822Z yolo_out.append(module(x, out)) 2025-09-07T14:21:57.3327337Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T14:21:57.3327912Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T14:21:57.3328475Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T14:21:57.3329066Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T14:21:57.3329328Z 2025-09-07T14:21:57.3329488Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T14:21:57.3330115Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 482, in forward_pass 2025-09-07T14:21:57.3330674Z return mod(*inputs) 2025-09-07T14:21:57.3331140Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T14:21:57.3331663Z return self.forward_once(x) 2025-09-07T14:21:57.3332171Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T14:21:57.3332729Z yolo_out.append(module(x, out)) 2025-09-07T14:21:57.3333221Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T14:21:57.3333785Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T14:21:57.3334363Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T14:21:57.3335040Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T14:21:57.3335289Z 2025-09-07T14:21:57.4467031Z cudagraph partition into 2 partitions 2025-09-07T14:22:10.6054674Z pass 2025-09-07T14:22:13.5729248Z accuracy pass_rate=88.89% 2025-09-07T14:22:13.5739062Z calls_captured gmean=0.00x mean=400.389x 2025-09-07T14:22:13.5746503Z unique_graphs gmean=0.00x mean=2.389x 2025-09-07T14:22:13.5754052Z graph_breaks gmean=0.00x mean=1.889x 2025-09-07T14:22:13.5761279Z unique_graph_breaks gmean=0.00x mean=0.389x 2025-09-07T14:22:13.5769257Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T14:22:13.5776637Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T14:22:13.5784328Z cudagraph_skips gmean=nanx mean=-0.333x 2025-09-07T14:22:13.5786541Z compilation_latency mean=15.271 seconds 2025-09-07T14:22:14.5470400Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *dynamic-true* ]] 2025-09-07T14:22:14.5476511Z + python benchmarks/dynamo/torchbench.py 
--accuracy --no-translation-validation --inference --bfloat16 --backend inductor --dynamic-shapes --dynamic-batch-only --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_dynamic_torchbench_bfloat16_inference_cuda_accuracy.csv 2025-09-07T14:22:15.1592054Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:22:15.1593472Z import pynvml # type: ignore[import] 2025-09-07T14:22:19.1248176Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:22:19.1249624Z import pynvml # type: ignore[import] 2025-09-07T14:22:21.9398212Z 2025-09-07T14:22:22.9803497Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:22:22.9803874Z loading model: 0it [00:01, ?it/s] 2025-09-07T14:22:22.9807724Z cuda eval soft_actor_critic 2025-09-07T14:22:26.6967623Z pass 2025-09-07T14:22:30.6518459Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:22:30.6521531Z import pynvml # type: ignore[import] 2025-09-07T14:22:33.6049057Z 2025-09-07T14:22:35.4143655Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:22:35.4144055Z loading model: 0it [00:01, ?it/s] 2025-09-07T14:22:35.4210173Z cuda eval speech_transformer 2025-09-07T14:22:46.8199723Z pass 2025-09-07T14:22:49.6369946Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:22:49.6371355Z import pynvml # type: ignore[import] 2025-09-07T14:22:53.3131196Z 2025-09-07T14:22:54.6907566Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:22:54.6907988Z loading model: 0it [00:01, ?it/s] 2025-09-07T14:22:54.6935524Z cuda eval squeezenet1_1 2025-09-07T14:22:59.0464568Z pass 2025-09-07T14:23:02.5686565Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:23:02.5688436Z import pynvml # type: ignore[import] 2025-09-07T14:23:05.5881341Z 2025-09-07T14:23:07.5975370Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:23:07.5975854Z 2025-09-07T14:23:08.2848978Z Loading pipeline components...: 0% 0/6 [00:00 2025-09-07T14:27:01.5933361Z W0907 14:27:01.590000 583756 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T14:27:01.5934789Z W0907 14:27:01.590000 583756 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T14:27:01.5936184Z W0907 14:27:01.590000 583756 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T14:27:01.5937102Z W0907 14:27:01.590000 583756 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:27:01.5937877Z W0907 14:27:01.590000 583756 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:27:06.8426232Z pass 2025-09-07T14:27:09.8830986Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:27:09.8832538Z import pynvml # type: ignore[import] 2025-09-07T14:27:12.6720511Z 2025-09-07T14:27:15.0010570Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:27:15.0010970Z loading model: 0it [00:02, ?it/s] 2025-09-07T14:27:15.0104343Z cuda eval yolov3 2025-09-07T14:27:29.7971110Z pass 2025-09-07T14:27:32.1371113Z accuracy pass_rate=88.89% 2025-09-07T14:27:32.1377117Z calls_captured gmean=0.00x mean=362.111x 2025-09-07T14:27:32.1381075Z unique_graphs gmean=0.00x mean=2.389x 2025-09-07T14:27:32.1385154Z graph_breaks gmean=0.00x mean=1.889x 2025-09-07T14:27:32.1389188Z unique_graph_breaks gmean=0.00x mean=0.389x 2025-09-07T14:27:32.1393296Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T14:27:32.1397394Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T14:27:32.1401607Z cudagraph_skips gmean=nanx mean=-0.333x 2025-09-07T14:27:32.1403042Z compilation_latency mean=6.948 seconds 2025-09-07T14:27:33.0350149Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *cppwrapper-true* ]] 2025-09-07T14:27:33.0351618Z + TORCHINDUCTOR_CPP_WRAPPER=1 2025-09-07T14:27:33.0353432Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --inference --bfloat16 --backend inductor --disable-cudagraphs --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_accuracy.csv 2025-09-07T14:27:33.6227799Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:27:33.6229216Z import pynvml # type: ignore[import] 2025-09-07T14:27:37.3486522Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:27:37.3489260Z import pynvml # type: ignore[import] 2025-09-07T14:27:39.9843255Z 2025-09-07T14:27:40.9609708Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:27:40.9610100Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:27:40.9614086Z cuda eval soft_actor_critic 2025-09-07T14:27:50.3336506Z pass 2025-09-07T14:27:53.0306858Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:27:53.0308303Z import pynvml # type: ignore[import] 2025-09-07T14:27:55.6388502Z 2025-09-07T14:27:57.4293433Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:27:57.4293827Z loading model: 0it [00:01, ?it/s] 2025-09-07T14:27:57.4360925Z cuda eval speech_transformer 2025-09-07T14:28:34.1566671Z pass 2025-09-07T14:28:37.2538335Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:28:37.2539740Z import pynvml # type: ignore[import] 2025-09-07T14:28:39.8595912Z 2025-09-07T14:28:40.8404705Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:28:40.8405096Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:28:40.8420808Z cuda eval squeezenet1_1 2025-09-07T14:28:53.2745417Z pass 2025-09-07T14:28:55.9915965Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:28:55.9917377Z import pynvml # type: ignore[import] 2025-09-07T14:28:58.5942345Z 2025-09-07T14:29:00.3861233Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:29:00.3861502Z 2025-09-07T14:29:01.0619205Z Loading pipeline components...: 0% 0/6 [00:00 2025-09-07T14:37:07.0330382Z W0907 14:37:07.030000 587791 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T14:37:07.0331804Z W0907 14:37:07.030000 587791 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T14:37:07.0333185Z W0907 14:37:07.030000 587791 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T14:37:07.0334107Z W0907 14:37:07.030000 587791 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:37:07.0334774Z W0907 14:37:07.030000 587791 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:37:24.9650849Z pass 2025-09-07T14:37:28.5746915Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. 
If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:37:28.5748364Z import pynvml # type: ignore[import] 2025-09-07T14:37:31.1911960Z 2025-09-07T14:37:33.2349113Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:37:33.2350685Z loading model: 0it [00:02, ?it/s] 2025-09-07T14:37:33.2442346Z cuda eval yolov3 2025-09-07T14:38:08.3212527Z pass 2025-09-07T14:38:10.9997749Z accuracy pass_rate=88.89% 2025-09-07T14:38:11.0004006Z calls_captured gmean=0.00x mean=407.778x 2025-09-07T14:38:11.0008678Z unique_graphs gmean=0.00x mean=2.444x 2025-09-07T14:38:11.0012860Z graph_breaks gmean=0.00x mean=1.889x 2025-09-07T14:38:11.0017763Z unique_graph_breaks gmean=0.00x mean=0.389x 2025-09-07T14:38:11.0022153Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T14:38:11.0026329Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T14:38:11.0030428Z cudagraph_skips gmean=0.00x mean=0.000x 2025-09-07T14:38:11.0031767Z compilation_latency mean=24.498 seconds 2025-09-07T14:38:11.8701121Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *freezing_cudagraphs-true* ]] 2025-09-07T14:38:11.8702953Z + [[ inference == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T14:38:11.8704552Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --inference --bfloat16 --backend inductor --device cuda --total-partitions 6 --partition-id 5 --freezing --output /var/lib/jenkins/workspace/test/test-reports/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_accuracy.csv 2025-09-07T14:38:12.4273430Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. 
If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:38:12.4275092Z import pynvml # type: ignore[import] 2025-09-07T14:38:15.9778307Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:38:15.9779906Z import pynvml # type: ignore[import] 2025-09-07T14:38:18.5889666Z 2025-09-07T14:38:19.5397355Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:38:19.5397739Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:38:19.5399792Z cuda eval soft_actor_critic 2025-09-07T14:38:24.2097958Z pass 2025-09-07T14:38:26.8900357Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:38:26.8902314Z import pynvml # type: ignore[import] 2025-09-07T14:38:29.4915440Z 2025-09-07T14:38:31.2077671Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:38:31.2078059Z loading model: 0it [00:01, ?it/s] 2025-09-07T14:38:31.2143893Z cuda eval speech_transformer 2025-09-07T14:38:40.8921777Z W0907 14:38:40.891000 588958 site-packages/torch/_inductor/utils.py:2298] [7/0_1] DeviceCopy in input program 2025-09-07T14:38:42.1151217Z cudagraph partition due to non gpu ops 2025-09-07T14:38:42.1151628Z cudagraph partition due to non gpu ops 2025-09-07T14:38:42.1151971Z cudagraph partition due to non gpu ops 2025-09-07T14:38:42.1152312Z cudagraph partition due to non gpu ops 2025-09-07T14:38:42.1152639Z cudagraph partition due to non gpu ops 2025-09-07T14:38:42.1152983Z cudagraph partition due to non gpu ops 2025-09-07T14:38:42.1153336Z cudagraph partition due to non gpu ops 2025-09-07T14:38:42.1153683Z cudagraph partition due to DeviceCopy ops 2025-09-07T14:38:42.1439238Z cudagraph partition into 2 partitions 2025-09-07T14:38:59.2757662Z W0907 14:38:59.275000 588958 site-packages/torch/_inductor/utils.py:2298] [13/0_1] DeviceCopy in input program 2025-09-07T14:39:00.8835062Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8835725Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8836089Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8836448Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8836804Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8837141Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8837481Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8837803Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8838140Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8838488Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8838836Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8839181Z cudagraph partition due to non gpu ops 
2025-09-07T14:39:00.8839520Z cudagraph partition due to non gpu ops 2025-09-07T14:39:00.8839854Z cudagraph partition due to DeviceCopy ops 2025-09-07T14:39:00.9355022Z cudagraph partition into 2 partitions 2025-09-07T14:39:03.1506817Z pass 2025-09-07T14:39:06.1022284Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:39:06.1023841Z import pynvml # type: ignore[import] 2025-09-07T14:39:08.7053529Z 2025-09-07T14:39:09.6858078Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:39:09.6858453Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:39:09.6873844Z cuda eval squeezenet1_1 2025-09-07T14:39:16.8943209Z pass 2025-09-07T14:39:19.4845128Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:39:19.4846542Z import pynvml # type: ignore[import] 2025-09-07T14:39:22.0957871Z 2025-09-07T14:39:23.9113932Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:39:23.9114201Z 2025-09-07T14:39:24.2921506Z Loading pipeline components...: 0% 0/6 [00:00 2025-09-07T14:45:41.8921269Z W0907 14:45:41.889000 595479 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T14:45:41.8922697Z W0907 14:45:41.889000 595479 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T14:45:41.8924350Z W0907 14:45:41.889000 595479 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T14:45:41.8925267Z W0907 14:45:41.889000 595479 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:45:41.8926057Z W0907 14:45:41.889000 595479 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:45:42.1624378Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T14:45:42.1625274Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T14:45:42.1626217Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T14:45:42.1626570Z 2025-09-07T14:45:42.4199243Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T14:45:42.4200132Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T14:45:42.4201063Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T14:45:42.4201429Z 2025-09-07T14:45:54.6671419Z pass 2025-09-07T14:45:58.1802126Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:45:58.1803898Z import pynvml # type: ignore[import] 2025-09-07T14:46:00.7896679Z 2025-09-07T14:46:02.8394115Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:46:02.8394513Z loading model: 0it [00:02, ?it/s] 2025-09-07T14:46:02.8487348Z cuda eval yolov3 2025-09-07T14:46:21.6347652Z cudagraph partition due to non gpu ops 2025-09-07T14:46:21.6348110Z cudagraph partition due to non gpu ops 2025-09-07T14:46:21.6348455Z cudagraph partition due to non gpu ops 2025-09-07T14:46:21.7139044Z cudagraph partition into 2 partitions 2025-09-07T14:46:39.9871147Z pass 2025-09-07T14:46:42.5610366Z accuracy pass_rate=88.89% 2025-09-07T14:46:42.5616919Z calls_captured gmean=0.00x mean=400.389x 2025-09-07T14:46:42.5621807Z unique_graphs gmean=0.00x mean=2.389x 2025-09-07T14:46:42.5626445Z graph_breaks gmean=0.00x mean=1.889x 2025-09-07T14:46:42.5630520Z unique_graph_breaks gmean=0.00x mean=0.389x 2025-09-07T14:46:42.5634799Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T14:46:42.5638952Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T14:46:42.5643448Z cudagraph_skips gmean=nanx mean=-0.333x 2025-09-07T14:46:42.5644912Z compilation_latency mean=16.936 seconds 2025-09-07T14:46:43.4107955Z + [[ 
training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *freeze_autotune_cudagraphs-true* ]] 2025-09-07T14:46:43.4109799Z + [[ inference == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T14:46:43.4110129Z + TORCHINDUCTOR_MAX_AUTOTUNE=1 2025-09-07T14:46:43.4111760Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --inference --bfloat16 --backend inductor --device cuda --total-partitions 6 --partition-id 5 --freezing --output /var/lib/jenkins/workspace/test/test-reports/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_accuracy.csv 2025-09-07T14:46:43.9667075Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:46:43.9668479Z import pynvml # type: ignore[import] 2025-09-07T14:46:47.5199923Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:46:47.5201322Z import pynvml # type: ignore[import] 2025-09-07T14:46:50.1217256Z 2025-09-07T14:46:51.0757434Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:46:51.0757978Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:46:51.0758323Z cuda eval soft_actor_critic 2025-09-07T14:46:58.6266633Z Autotune Choices Stats: 2025-09-07T14:46:58.6267893Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_0", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T14:46:58.6437849Z AUTOTUNE mm(256x3, 3x1024) 2025-09-07T14:46:58.6438243Z strides: [3, 1], [1, 3] 2025-09-07T14:46:58.6438541Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:46:58.6439337Z triton_mm_0 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T14:46:58.6440836Z triton_mm_2 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:46:58.6442028Z triton_mm_5 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:46:58.6443586Z triton_mm_6 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:46:58.6444783Z triton_mm_10 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:46:58.6445985Z triton_mm_1 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:46:58.6447160Z triton_mm_3 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:46:58.6448336Z triton_mm_4 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:46:58.6449517Z triton_mm_7 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:46:58.6450796Z triton_mm_8 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:46:58.6451824Z SingleProcess AUTOTUNE benchmarking takes 0.2484 seconds and 0.0015 seconds precompiling for 17 choices 2025-09-07T14:46:59.3485931Z Autotune Choices Stats: 2025-09-07T14:46:59.3487478Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.014336000196635723, "best_triton_pos": 1, "best_triton_time": 0.014336000196635723, "best_triton_kernel": "triton_mm_24", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T14:46:59.3646884Z AUTOTUNE mm(256x1024, 1024x1024) 2025-09-07T14:46:59.3647246Z strides: [1024, 1], [1, 1024] 2025-09-07T14:46:59.3647544Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:46:59.3647862Z mm 0.0143 ms 100.0% 2025-09-07T14:46:59.3648599Z triton_mm_24 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 
2025-09-07T14:46:59.3649792Z triton_mm_23 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:46:59.3651232Z triton_mm_20 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:46:59.3652431Z triton_mm_17 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:46:59.3653611Z triton_mm_27 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:46:59.3654771Z triton_mm_18 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:46:59.3655937Z triton_mm_19 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:46:59.3657279Z triton_mm_26 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:46:59.3658568Z triton_mm_29 0.0215 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:46:59.3659592Z SingleProcess AUTOTUNE benchmarking takes 0.2913 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T14:47:00.2787690Z Autotune Choices Stats: 2025-09-07T14:47:00.2788925Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_38", "best_kernel_desc": "ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T14:47:00.2939029Z AUTOTUNE addmm(256x2, 256x1024, 1024x2) 2025-09-07T14:47:00.2946822Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T14:47:00.2947288Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:47:00.2948143Z triton_mm_38 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:00.2949358Z triton_mm_44 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:47:00.2950732Z triton_mm_36 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T14:47:00.2951483Z bias_addmm 0.0133 ms 76.9% 2025-09-07T14:47:00.2952192Z triton_mm_37 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:00.2953370Z triton_mm_41 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:00.2954546Z triton_mm_49 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:00.2955733Z triton_mm_35 0.0143 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T14:47:00.2956930Z triton_mm_43 0.0154 ms 66.7% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:47:00.2958200Z triton_mm_48 0.0154 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:00.2959233Z SingleProcess AUTOTUNE benchmarking takes 0.2806 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T14:47:01.1131586Z pass 2025-09-07T14:47:03.7388617Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:47:03.7390028Z import pynvml # type: ignore[import] 2025-09-07T14:47:06.3453335Z 2025-09-07T14:47:08.0730650Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:47:08.0731045Z loading model: 0it [00:01, ?it/s] 2025-09-07T14:47:08.0796470Z cuda eval speech_transformer 2025-09-07T14:47:17.9185480Z W0907 14:47:17.917000 597582 site-packages/torch/_inductor/utils.py:2298] [7/0_1] DeviceCopy in input program 2025-09-07T14:47:22.8991592Z Autotune Choices Stats: 2025-09-07T14:47:22.8993116Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_100", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.03174399957060814, "best_triton_pos": 0} 2025-09-07T14:47:22.9155968Z AUTOTUNE mm(2040x512, 512x2048) 2025-09-07T14:47:22.9156269Z strides: [512, 1], [1, 512] 2025-09-07T14:47:22.9156554Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:22.9157339Z triton_mm_100 0.0317 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=4 2025-09-07T14:47:22.9158551Z triton_mm_102 0.0317 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:22.9159752Z triton_mm_103 0.0328 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:22.9160942Z triton_mm_107 0.0328 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:22.9162134Z triton_mm_106 0.0338 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:22.9163689Z triton_mm_101 0.0358 ms 88.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:22.9164869Z triton_mm_104 0.0358 ms 88.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:22.9166057Z triton_mm_108 0.0369 ms 86.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:22.9166800Z mm 0.0399 ms 79.5% 2025-09-07T14:47:22.9167482Z triton_mm_97 0.0410 ms 77.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:22.9168504Z SingleProcess AUTOTUNE benchmarking takes 0.3408 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T14:47:23.5786114Z Autotune Choices Stats: 2025-09-07T14:47:23.5787330Z {"num_choices": 19, 
"num_triton_choices": 18, "best_kernel": "triton_mm_84", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T14:47:23.5946982Z AUTOTUNE mm(2040x512, 512x512) 2025-09-07T14:47:23.5947541Z strides: [512, 1], [1, 512] 2025-09-07T14:47:23.5947857Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:23.5948614Z triton_mm_84 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:23.5949833Z triton_mm_90 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:23.5950580Z mm 0.0174 ms 94.1% 2025-09-07T14:47:23.5951276Z triton_mm_82 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:23.5952456Z triton_mm_85 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:23.5953756Z triton_mm_89 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:23.5955012Z triton_mm_80 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:23.5956181Z triton_mm_83 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 
2025-09-07T14:47:23.5957364Z triton_mm_86 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:23.5958551Z triton_mm_88 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:23.5959581Z SingleProcess AUTOTUNE benchmarking takes 0.2770 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:47:24.7025163Z Autotune Choices Stats: 2025-09-07T14:47:24.7026409Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_9", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.014336000196635723, "best_triton_pos": 0} 2025-09-07T14:47:24.7180698Z AUTOTUNE mm(2040x320, 320x512) 2025-09-07T14:47:24.7181292Z strides: [320, 1], [1, 320] 2025-09-07T14:47:24.7181595Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:24.7182384Z triton_mm_9 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:24.7183597Z triton_mm_11 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:24.7184782Z triton_mm_12 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:24.7185993Z triton_mm_16 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:24.7187192Z 
triton_mm_17 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:24.7187943Z mm 0.0154 ms 93.3% 2025-09-07T14:47:24.7188626Z triton_mm_7 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:24.7189888Z triton_mm_10 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:24.7191068Z triton_mm_13 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:24.7192257Z triton_mm_15 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:24.7193286Z SingleProcess AUTOTUNE benchmarking takes 0.2684 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:47:25.0119111Z Autotune Choices Stats: 2025-09-07T14:47:25.0120447Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_33", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.02252800017595291, "best_triton_pos": 0} 2025-09-07T14:47:25.0270869Z AUTOTUNE mm(2040x512, 512x1536) 2025-09-07T14:47:25.0271274Z strides: [512, 1], [1536, 1] 2025-09-07T14:47:25.0271581Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:25.0272366Z triton_mm_33 0.0225 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T14:47:25.0273126Z mm 0.0236 ms 95.7% 2025-09-07T14:47:25.0273812Z triton_mm_34 0.0246 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.0274993Z triton_mm_27 0.0266 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.0276175Z triton_mm_29 0.0266 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.0277354Z triton_mm_30 0.0266 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.0278544Z triton_mm_35 0.0276 ms 81.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:25.0279817Z triton_mm_32 0.0297 ms 75.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T14:47:25.0280997Z triton_mm_28 0.0307 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:25.0282178Z triton_mm_31 0.0317 ms 71.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:25.0283436Z SingleProcess AUTOTUNE benchmarking takes 0.3085 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:47:25.2976768Z Autotune Choices Stats: 2025-09-07T14:47:25.2977984Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": 
"triton_bmm_44", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.01740800030529499, "best_triton_pos": 0} 2025-09-07T14:47:25.3127976Z AUTOTUNE bmm(80x204x64, 80x64x204) 2025-09-07T14:47:25.3128322Z strides: [13056, 64, 1], [13056, 1, 64] 2025-09-07T14:47:25.3128642Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:25.3129556Z triton_bmm_44 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:25.3130757Z triton_bmm_43 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:25.3131954Z triton_bmm_45 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.3133144Z triton_bmm_47 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.3134335Z triton_bmm_48 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:47:25.3135579Z triton_bmm_40 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:25.3136762Z triton_bmm_42 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:25.3138086Z triton_bmm_46 0.0195 ms 89.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:25.3139279Z triton_bmm_49 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.3140474Z triton_bmm_50 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:25.3141506Z SingleProcess AUTOTUNE benchmarking takes 0.2852 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T14:47:25.5769124Z Autotune Choices Stats: 2025-09-07T14:47:25.5770274Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_bmm_61", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.020479999482631683, "best_triton_pos": 0} 2025-09-07T14:47:25.5921187Z AUTOTUNE bmm(80x204x204, 80x204x64) 2025-09-07T14:47:25.5921538Z strides: [41664, 204, 1], [13056, 64, 1] 2025-09-07T14:47:25.5922042Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:25.5923022Z triton_bmm_61 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:25.5924227Z triton_bmm_64 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.5925425Z triton_bmm_57 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:25.5926612Z triton_bmm_62 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:25.5927792Z triton_bmm_65 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:25.5928984Z triton_bmm_66 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.5930251Z triton_bmm_68 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.5931457Z triton_bmm_69 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:25.5932645Z triton_bmm_70 0.0215 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T14:47:25.5933847Z triton_bmm_71 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.5934886Z SingleProcess AUTOTUNE benchmarking takes 0.2788 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:47:25.9214796Z Autotune Choices Stats: 2025-09-07T14:47:25.9216370Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.035840000957250595, "best_triton_pos": 1, "best_triton_time": 0.04095999896526337, "best_triton_kernel": "triton_mm_120", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T14:47:25.9374929Z 
AUTOTUNE mm(2040x2048, 2048x512) 2025-09-07T14:47:25.9375254Z strides: [2048, 1], [1, 2048] 2025-09-07T14:47:25.9375558Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:25.9375871Z mm 0.0358 ms 100.0% 2025-09-07T14:47:25.9376571Z triton_mm_120 0.0410 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.9377832Z triton_mm_125 0.0420 ms 85.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.9379036Z triton_mm_116 0.0440 ms 81.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:25.9380227Z triton_mm_117 0.0440 ms 81.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:25.9381422Z triton_mm_126 0.0451 ms 79.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:25.9382747Z triton_mm_121 0.0461 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.9383939Z triton_mm_118 0.0481 ms 74.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.9385115Z triton_mm_119 0.0512 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:25.9386312Z triton_mm_124 0.0512 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:25.9387346Z SingleProcess AUTOTUNE benchmarking takes 0.3448 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T14:47:26.0392153Z cudagraph partition due to non gpu ops 2025-09-07T14:47:26.0392542Z cudagraph partition due to non gpu ops 2025-09-07T14:47:26.0392889Z cudagraph partition due to non gpu ops 2025-09-07T14:47:26.0393215Z cudagraph partition due to non gpu ops 2025-09-07T14:47:26.0393559Z cudagraph partition due to non gpu ops 2025-09-07T14:47:26.0393896Z cudagraph partition due to non gpu ops 2025-09-07T14:47:26.0394232Z cudagraph partition due to non gpu ops 2025-09-07T14:47:26.0394691Z cudagraph partition due to DeviceCopy ops 2025-09-07T14:47:26.0691894Z cudagraph partition into 2 partitions 2025-09-07T14:47:44.0593615Z W0907 14:47:44.058000 597582 site-packages/torch/_inductor/utils.py:2298] [13/0_1] DeviceCopy in input program 2025-09-07T14:47:50.3698798Z Autotune Choices Stats: 2025-09-07T14:47:50.3700314Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.013311999849975109, "best_triton_pos": 1, "best_triton_time": 0.013311999849975109, "best_triton_kernel": "triton_mm_881", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T14:47:50.3863679Z AUTOTUNE mm(220x512, 512x2048) 2025-09-07T14:47:50.3864029Z strides: [512, 1], [1, 512] 2025-09-07T14:47:50.3864342Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:50.3864662Z mm 0.0133 ms 100.0% 2025-09-07T14:47:50.3865683Z triton_mm_881 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:50.3867025Z triton_mm_877 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:50.3868217Z triton_mm_880 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:50.3869400Z triton_mm_883 0.0143 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:50.3870590Z triton_mm_871 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:50.3871778Z triton_mm_878 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:50.3872962Z triton_mm_879 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:50.3874143Z triton_mm_882 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:50.3875442Z triton_mm_886 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:50.3876463Z SingleProcess AUTOTUNE benchmarking takes 0.2776 seconds and 0.0004 seconds precompiling for 19 choices 2025-09-07T14:47:51.0406407Z Autotune Choices Stats: 2025-09-07T14:47:51.0407665Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_716", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T14:47:51.0572230Z AUTOTUNE mm(220x512, 512x512) 2025-09-07T14:47:51.0572555Z strides: [512, 1], [1, 512] 2025-09-07T14:47:51.0572857Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:51.0573625Z triton_mm_716 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:51.0574403Z mm 0.0113 ms 90.9% 2025-09-07T14:47:51.0575104Z triton_mm_713 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:51.0576524Z triton_mm_714 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:51.0577773Z triton_mm_715 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:51.0578951Z triton_mm_719 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:51.0580135Z triton_mm_720 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:51.0581326Z triton_mm_723 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:51.0582630Z triton_mm_722 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=8 2025-09-07T14:47:51.0583908Z triton_mm_725 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:51.0584927Z SingleProcess AUTOTUNE benchmarking takes 0.2690 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T14:47:52.1047919Z Autotune Choices Stats: 2025-09-07T14:47:52.1049447Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.01228800043463707, "best_triton_pos": 1, "best_triton_time": 0.01228800043463707, "best_triton_kernel": "triton_mm_679", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T14:47:52.1208235Z AUTOTUNE mm(220x512, 512x1536) 2025-09-07T14:47:52.1208556Z strides: [512, 1], [1536, 1] 2025-09-07T14:47:52.1208861Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:52.1209184Z mm 0.0123 ms 100.0% 2025-09-07T14:47:52.1209878Z triton_mm_679 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:52.1211077Z triton_mm_680 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:52.1212532Z triton_mm_683 0.0133 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:52.1213722Z triton_mm_673 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:52.1214916Z triton_mm_674 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:52.1216094Z triton_mm_675 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:52.1217260Z triton_mm_682 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:52.1218531Z triton_mm_685 0.0143 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:52.1219813Z triton_mm_676 0.0154 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:52.1220850Z SingleProcess AUTOTUNE benchmarking takes 0.2695 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:47:52.2677916Z Autotune Choices Stats: 2025-09-07T14:47:52.2679076Z {"num_choices": 11, "num_triton_choices": 10, "best_kernel": "triton_bmm_691", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T14:47:52.2832385Z AUTOTUNE bmm(80x22x64, 80x64x22) 2025-09-07T14:47:52.2832694Z strides: [1408, 64, 1], [1408, 1, 64] 2025-09-07T14:47:52.2833021Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:52.2833793Z triton_bmm_691 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:52.2835167Z triton_bmm_692 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, 
BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:52.2836460Z triton_bmm_693 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:52.2837665Z triton_bmm_696 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:52.2838855Z triton_bmm_697 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:52.2840059Z triton_bmm_698 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:47:52.2841257Z triton_bmm_690 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T14:47:52.2842448Z triton_bmm_694 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:52.2843851Z triton_bmm_695 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:52.2845119Z triton_bmm_699 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:47:52.2846156Z SingleProcess AUTOTUNE benchmarking takes 0.1615 seconds and 0.0002 seconds precompiling for 11 choices 2025-09-07T14:47:52.4538876Z Autotune Choices Stats: 2025-09-07T14:47:52.4540053Z {"num_choices": 13, 
"num_triton_choices": 12, "best_kernel": "triton_bmm_701", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T14:47:52.4697153Z AUTOTUNE bmm(80x22x22, 80x22x64) 2025-09-07T14:47:52.4697561Z strides: [484, 22, 1], [1408, 64, 1] 2025-09-07T14:47:52.4697883Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:52.4698699Z triton_bmm_701 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:52.4699925Z triton_bmm_702 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:52.4701376Z triton_bmm_703 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:52.4702591Z triton_bmm_706 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:52.4703795Z triton_bmm_709 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:52.4704999Z triton_bmm_711 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T14:47:52.4705739Z bmm 0.0082 ms 87.5% 2025-09-07T14:47:52.4706522Z triton_bmm_700 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, 
num_warps=2 2025-09-07T14:47:52.4707727Z triton_bmm_704 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:52.4708998Z triton_bmm_705 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:52.4710027Z SingleProcess AUTOTUNE benchmarking takes 0.1859 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T14:47:52.7226963Z Autotune Choices Stats: 2025-09-07T14:47:52.7228116Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_bmm_833", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T14:47:52.7385978Z AUTOTUNE bmm(80x22x64, 80x64x204) 2025-09-07T14:47:52.7386322Z strides: [1408, 64, 1], [13056, 1, 64] 2025-09-07T14:47:52.7386653Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:52.7387439Z triton_bmm_833 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:52.7388634Z triton_bmm_820 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T14:47:52.7389940Z triton_bmm_821 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:52.7391127Z triton_bmm_822 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T14:47:52.7392320Z triton_bmm_823 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:52.7393519Z triton_bmm_824 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:52.7394711Z triton_bmm_825 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:52.7395897Z triton_bmm_826 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:52.7397153Z triton_bmm_827 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:52.7398328Z triton_bmm_828 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:52.7399356Z SingleProcess AUTOTUNE benchmarking takes 0.2463 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T14:47:52.9500031Z Autotune Choices Stats: 2025-09-07T14:47:52.9501215Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_bmm_839", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T14:47:52.9660588Z AUTOTUNE bmm(80x22x204, 80x204x64) 2025-09-07T14:47:52.9660924Z strides: [4544, 204, 1], [13056, 64, 1] 2025-09-07T14:47:52.9661261Z dtypes: torch.bfloat16, torch.bfloat16 
2025-09-07T14:47:52.9662174Z triton_bmm_839 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:52.9663405Z triton_bmm_844 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:52.9664695Z triton_bmm_845 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:52.9665920Z triton_bmm_846 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:52.9667122Z triton_bmm_847 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:52.9668345Z triton_bmm_848 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:52.9669562Z triton_bmm_849 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:47:52.9670784Z triton_bmm_851 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:52.9672064Z triton_bmm_838 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:52.9673267Z triton_bmm_840 0.0123 ms 91.7% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:52.9674302Z SingleProcess AUTOTUNE benchmarking takes 0.2269 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T14:47:53.2629246Z Autotune Choices Stats: 2025-09-07T14:47:53.2630401Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_892", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T14:47:53.2791334Z AUTOTUNE mm(220x2048, 2048x512) 2025-09-07T14:47:53.2791656Z strides: [2048, 1], [1, 2048] 2025-09-07T14:47:53.2791968Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:53.2792735Z triton_mm_892 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:53.2793472Z mm 0.0184 ms 83.3% 2025-09-07T14:47:53.2794268Z triton_mm_896 0.0184 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:53.2795466Z triton_mm_889 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:53.2796644Z triton_mm_890 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:53.2797821Z triton_mm_891 0.0225 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:53.2798994Z triton_mm_895 0.0246 ms 62.5% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:53.2800220Z triton_mm_899 0.0276 ms 55.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:53.2801469Z triton_mm_898 0.0317 ms 48.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:53.2802649Z triton_mm_901 0.0328 ms 46.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:53.2803904Z SingleProcess AUTOTUNE benchmarking takes 0.3125 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T14:47:53.6898953Z Autotune Choices Stats: 2025-09-07T14:47:53.6900143Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_1724", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T14:47:53.7068565Z AUTOTUNE mm(220x512, 512x1014) 2025-09-07T14:47:53.7068888Z strides: [512, 1], [1, 512] 2025-09-07T14:47:53.7069175Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:47:53.7069964Z triton_mm_1724 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:53.7071172Z triton_mm_1723 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:47:53.7072498Z triton_mm_1717 0.0133 ms 84.6% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:47:53.7073704Z triton_mm_1727 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:47:53.7074889Z triton_mm_1718 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:53.7076062Z triton_mm_1719 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:47:53.7077256Z triton_mm_1720 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:47:53.7078450Z triton_mm_1726 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:53.7079706Z triton_mm_1729 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:47:53.7080464Z mm 0.0154 ms 73.3% 2025-09-07T14:47:53.7080997Z SingleProcess AUTOTUNE benchmarking takes 0.2705 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:47:53.7303811Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7304191Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7304536Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7304879Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7305222Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7305547Z cudagraph partition due to non gpu ops 
2025-09-07T14:47:53.7305883Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7306217Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7306559Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7306888Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7307336Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7307679Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7308018Z cudagraph partition due to non gpu ops 2025-09-07T14:47:53.7308458Z cudagraph partition due to DeviceCopy ops 2025-09-07T14:47:53.7788212Z cudagraph partition into 2 partitions 2025-09-07T14:47:56.7371742Z pass 2025-09-07T14:48:00.0338662Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:48:00.0340120Z import pynvml # type: ignore[import] 2025-09-07T14:48:02.6333148Z 2025-09-07T14:48:03.6199974Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:48:03.6200356Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:48:03.6216072Z cuda eval squeezenet1_1 2025-09-07T14:48:15.0512353Z Autotune Choices Stats: 2025-09-07T14:48:15.0513838Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_7", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T14:48:15.1045631Z AUTOTUNE addmm(12100x16, 12100x64, 64x16) 2025-09-07T14:48:15.1046185Z strides: [0, 1], [64, 1], [1, 64] 2025-09-07T14:48:15.1046624Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:15.1047919Z triton_mm_7 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T14:48:15.1049182Z triton_mm_8 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T14:48:15.1050447Z triton_mm_9 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:15.1051700Z triton_mm_10 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:15.1052973Z triton_mm_13 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:15.1054221Z triton_mm_14 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:15.1055570Z triton_mm_15 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:48:15.1056835Z triton_mm_16 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:48:15.1058178Z triton_mm_17 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:15.1059435Z triton_mm_18 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:15.1060539Z SingleProcess AUTOTUNE benchmarking 
takes 0.3618 seconds and 0.0004 seconds precompiling for 18 choices 2025-09-07T14:48:15.7768396Z Autotune Choices Stats: 2025-09-07T14:48:15.7769894Z {"num_choices": 18, "num_triton_choices": 16, "best_kernel": "triton_mm_45", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T14:48:15.7943375Z AUTOTUNE addmm(12100x16, 12100x128, 128x16) 2025-09-07T14:48:15.7943880Z strides: [0, 1], [128, 1], [1, 128] 2025-09-07T14:48:15.7944253Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:15.7945085Z triton_mm_45 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T14:48:15.7946284Z triton_mm_50 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:15.7947465Z triton_mm_51 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:15.7948637Z triton_mm_52 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:48:15.7949826Z triton_mm_54 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:15.7951030Z triton_mm_55 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:15.7952325Z triton_mm_57 0.0113 ms 100.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:15.7953512Z triton_mm_43 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T14:48:15.7954687Z triton_mm_46 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:15.7955871Z triton_mm_47 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:15.7956886Z SingleProcess AUTOTUNE benchmarking takes 0.2844 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T14:48:16.4032055Z Autotune Choices Stats: 2025-09-07T14:48:16.4033269Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_82", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T14:48:16.4196492Z AUTOTUNE addmm(2916x32, 2916x128, 128x32) 2025-09-07T14:48:16.4197145Z strides: [0, 1], [128, 1], [1, 128] 2025-09-07T14:48:16.4197544Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:16.4198368Z triton_mm_82 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:16.4199590Z triton_mm_84 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:16.4200784Z triton_mm_81 0.0092 ms 88.9% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:16.4201956Z triton_mm_83 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:16.4203444Z triton_mm_87 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:48:16.4204703Z triton_mm_88 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:16.4205876Z triton_mm_89 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:16.4207025Z triton_mm_90 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:16.4208201Z triton_mm_91 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:48:16.4209378Z triton_mm_92 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:16.4210398Z SingleProcess AUTOTUNE benchmarking takes 0.2764 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:48:16.9214443Z Autotune Choices Stats: 2025-09-07T14:48:16.9215670Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_122", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T14:48:16.9383352Z AUTOTUNE addmm(2916x32, 2916x256, 256x32) 2025-09-07T14:48:16.9383716Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T14:48:16.9384092Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:16.9384947Z triton_mm_122 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:16.9386151Z triton_mm_123 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:16.9387349Z triton_mm_124 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:16.9388523Z triton_mm_125 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:16.9389715Z triton_mm_128 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:48:16.9391041Z triton_mm_130 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:16.9392236Z triton_mm_131 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:16.9393431Z triton_mm_132 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=4, num_warps=4 2025-09-07T14:48:16.9394630Z triton_mm_136 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:16.9395816Z triton_mm_129 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:16.9396935Z SingleProcess AUTOTUNE benchmarking takes 0.2798 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:48:17.5717396Z Autotune Choices Stats: 2025-09-07T14:48:17.5718589Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_254", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T14:48:17.5878426Z AUTOTUNE addmm(676x64, 676x384, 384x64) 2025-09-07T14:48:17.5878799Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T14:48:17.5879176Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:17.5880006Z triton_mm_254 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:17.5881219Z triton_mm_251 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:17.5882405Z triton_mm_252 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:17.5883791Z triton_mm_253 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=8 2025-09-07T14:48:17.5884977Z triton_mm_258 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:17.5886330Z triton_mm_262 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:48:17.5887508Z triton_mm_257 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:48:17.5888682Z triton_mm_260 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:17.5889869Z triton_mm_261 0.0113 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:17.5890622Z bias_addmm 0.0123 ms 75.0% 2025-09-07T14:48:17.5891189Z SingleProcess AUTOTUNE benchmarking takes 0.2935 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T14:48:18.1104663Z Autotune Choices Stats: 2025-09-07T14:48:18.1106195Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_298", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T14:48:18.1271949Z AUTOTUNE addmm(676x64, 676x512, 512x64) 2025-09-07T14:48:18.1272303Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T14:48:18.1272666Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:18.1273507Z triton_mm_298 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, 
BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:18.1274706Z triton_mm_295 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:18.1275910Z triton_mm_296 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:18.1277286Z triton_mm_297 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:18.1278473Z triton_mm_302 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:18.1279768Z triton_mm_306 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:48:18.1280519Z bias_addmm 0.0123 ms 83.3% 2025-09-07T14:48:18.1281242Z triton_mm_301 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:48:18.1282417Z triton_mm_305 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:18.1283827Z triton_mm_304 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:18.1284859Z SingleProcess AUTOTUNE benchmarking takes 0.3043 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T14:48:18.8694796Z Autotune Choices Stats: 
2025-09-07T14:48:18.8695995Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_163", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.009216000325977802, "best_triton_pos": 0} 2025-09-07T14:48:18.8861586Z AUTOTUNE addmm(676x48, 676x256, 256x48) 2025-09-07T14:48:18.8861940Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T14:48:18.8862324Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:18.8863171Z triton_mm_163 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:18.8864367Z triton_mm_164 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:18.8865553Z triton_mm_165 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:18.8866744Z triton_mm_166 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:18.8867498Z bias_addmm 0.0102 ms 90.0% 2025-09-07T14:48:18.8868337Z triton_mm_169 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:48:18.8869525Z triton_mm_170 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:18.8870693Z triton_mm_171 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:18.8871874Z triton_mm_172 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:18.8873042Z triton_mm_173 0.0102 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:18.8874060Z SingleProcess AUTOTUNE benchmarking takes 0.2913 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T14:48:19.3963513Z Autotune Choices Stats: 2025-09-07T14:48:19.3972850Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_207", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T14:48:19.4132678Z AUTOTUNE addmm(676x48, 676x384, 384x48) 2025-09-07T14:48:19.4133027Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T14:48:19.4133404Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:19.4134234Z triton_mm_207 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:19.4135446Z triton_mm_208 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:19.4136647Z triton_mm_209 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:19.4137901Z triton_mm_210 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, 
BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:19.4139092Z triton_mm_218 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:48:19.4140063Z bias_addmm 0.0113 ms 90.9% 2025-09-07T14:48:19.4140769Z triton_mm_213 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:48:19.4141969Z triton_mm_214 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:19.4143145Z triton_mm_216 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:19.4144319Z triton_mm_217 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:19.4145344Z SingleProcess AUTOTUNE benchmarking takes 0.2948 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T14:48:19.9349109Z Autotune Choices Stats: 2025-09-07T14:48:19.9350635Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_0", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.025599999353289604, "best_triton_pos": 0} 2025-09-07T14:48:19.9529247Z AUTOTUNE convolution(4x3x224x224, 64x3x3x3) 2025-09-07T14:48:19.9529639Z strides: [150528, 1, 672, 3], [27, 1, 9, 3] 2025-09-07T14:48:19.9529981Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:48:19.9530907Z 
triton_convolution2d_0 0.0256 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:19.9532408Z triton_convolution2d_3 0.0256 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:19.9533903Z triton_convolution2d_4 0.0287 ms 89.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:19.9535590Z triton_convolution2d_5 0.0287 ms 89.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:19.9536585Z convolution 0.0338 ms 75.8% 2025-09-07T14:48:19.9537450Z triton_convolution2d_2 0.0338 ms 75.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:48:19.9539007Z triton_convolution2d_1 0.0543 ms 47.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:19.9540184Z SingleProcess AUTOTUNE benchmarking takes 0.1742 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T14:48:20.1875629Z Autotune Choices Stats: 2025-09-07T14:48:20.1876797Z {"num_choices": 17, "num_triton_choices": 15, "best_kernel": "triton_mm_23", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 
0.008191999979317188, "best_triton_pos": 0} 2025-09-07T14:48:20.2037177Z AUTOTUNE addmm(12100x64, 12100x16, 16x64) 2025-09-07T14:48:20.2037519Z strides: [0, 1], [16, 1], [1, 16] 2025-09-07T14:48:20.2037874Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:20.2038705Z triton_mm_23 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:20.2040085Z triton_mm_24 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:20.2041280Z triton_mm_25 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:20.2042466Z triton_mm_26 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:20.2043859Z triton_mm_27 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:20.2045031Z triton_mm_28 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:48:20.2046216Z triton_mm_29 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:20.2047480Z triton_mm_31 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:20.2048670Z triton_mm_33 0.0082 ms 100.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:20.2049866Z triton_mm_34 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:20.2050912Z SingleProcess AUTOTUNE benchmarking takes 0.2503 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T14:48:20.3044227Z Autotune Choices Stats: 2025-09-07T14:48:20.3045955Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.01228800043463707, "best_triton_pos": 1, "best_triton_time": 0.013311999849975109, "best_triton_kernel": "triton_convolution2d_37", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T14:48:20.3208249Z AUTOTUNE convolution(4x16x55x55, 64x16x3x3) 2025-09-07T14:48:20.3208628Z strides: [48400, 1, 880, 16], [144, 1, 48, 16] 2025-09-07T14:48:20.3209001Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:48:20.3209343Z convolution 0.0123 ms 100.0% 2025-09-07T14:48:20.3210237Z triton_convolution2d_37 0.0133 ms 92.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:20.3211729Z triton_convolution2d_41 0.0143 ms 85.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:20.3213202Z triton_convolution2d_38 0.0154 ms 80.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, 
num_stages=2, num_warps=4 2025-09-07T14:48:20.3214693Z triton_convolution2d_40 0.0154 ms 80.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:20.3216170Z triton_convolution2d_42 0.0154 ms 80.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:20.3217839Z triton_convolution2d_39 0.0195 ms 63.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:48:20.3219023Z SingleProcess AUTOTUNE benchmarking takes 0.1166 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T14:48:20.8318346Z Autotune Choices Stats: 2025-09-07T14:48:20.8319560Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_97", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T14:48:20.8488918Z AUTOTUNE addmm(2916x128, 2916x32, 32x128) 2025-09-07T14:48:20.8489326Z strides: [0, 1], [32, 1], [1, 32] 2025-09-07T14:48:20.8489685Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:20.8490505Z triton_mm_97 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T14:48:20.8491892Z triton_mm_98 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:20.8493084Z triton_mm_99 0.0082 ms 100.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:20.8494277Z triton_mm_100 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:20.8495464Z triton_mm_101 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:20.8496642Z triton_mm_102 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:20.8497989Z triton_mm_103 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:20.8499185Z triton_mm_104 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:48:20.8500460Z triton_mm_105 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:20.8501655Z triton_mm_106 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:20.8502694Z SingleProcess AUTOTUNE benchmarking takes 0.5178 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T14:48:20.9666542Z Autotune Choices Stats: 2025-09-07T14:48:20.9668256Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.011264000087976456, "best_triton_pos": 1, "best_triton_time": 
0.014336000196635723, "best_triton_kernel": "triton_convolution2d_118", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T14:48:20.9831774Z AUTOTUNE convolution(4x32x27x27, 128x32x3x3) 2025-09-07T14:48:20.9832210Z strides: [23328, 1, 864, 32], [288, 1, 96, 32] 2025-09-07T14:48:20.9832566Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:48:20.9832890Z convolution 0.0113 ms 100.0% 2025-09-07T14:48:20.9833902Z triton_convolution2d_118 0.0143 ms 78.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:20.9835404Z triton_convolution2d_119 0.0154 ms 73.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:20.9836911Z triton_convolution2d_114 0.0174 ms 64.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:20.9838403Z triton_convolution2d_120 0.0174 ms 64.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:20.9839904Z triton_convolution2d_117 0.0184 ms 61.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:20.9841457Z triton_convolution2d_115 0.0215 ms 52.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, 
STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:20.9843139Z triton_convolution2d_116 0.0297 ms 37.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:48:20.9844323Z SingleProcess AUTOTUNE benchmarking takes 0.1338 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:48:21.2798723Z Autotune Choices Stats: 2025-09-07T14:48:21.2799947Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_184", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.0071680000983178616, "best_triton_pos": 0} 2025-09-07T14:48:21.2967301Z AUTOTUNE addmm(676x192, 676x48, 48x192) 2025-09-07T14:48:21.2967650Z strides: [0, 1], [48, 1], [1, 48] 2025-09-07T14:48:21.2968066Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:21.2972784Z triton_mm_184 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:21.2974180Z triton_mm_180 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T14:48:21.2975364Z triton_mm_181 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.2976558Z triton_mm_182 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:21.2977838Z triton_mm_183 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:21.2979025Z triton_mm_186 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.2980210Z triton_mm_187 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:48:21.2981387Z triton_mm_188 0.0082 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:21.2982228Z bias_addmm 0.0092 ms 77.8% 2025-09-07T14:48:21.2982944Z triton_mm_185 0.0092 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.2983975Z SingleProcess AUTOTUNE benchmarking takes 0.3033 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T14:48:21.4262707Z Autotune Choices Stats: 2025-09-07T14:48:21.4264414Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01228800043463707, "best_triton_pos": 1, "best_triton_time": 0.020479999482631683, "best_triton_kernel": "triton_convolution2d_203", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T14:48:21.4434968Z AUTOTUNE convolution(4x48x13x13, 192x48x3x3) 2025-09-07T14:48:21.4435354Z strides: [8112, 1, 624, 48], [432, 1, 144, 48] 2025-09-07T14:48:21.4435758Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:48:21.4436071Z convolution 0.0123 ms 100.0% 2025-09-07T14:48:21.4437079Z triton_convolution2d_203 0.0205 ms 60.0% ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.4438591Z triton_convolution2d_202 0.0266 ms 46.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:21.4440096Z triton_convolution2d_204 0.0297 ms 41.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:21.4441590Z triton_convolution2d_199 0.0307 ms 40.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.4443350Z triton_convolution2d_205 0.0307 ms 40.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:21.4444850Z triton_convolution2d_200 0.0338 ms 36.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.4446410Z triton_convolution2d_201 0.0440 ms 27.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:48:21.4447583Z SingleProcess AUTOTUNE benchmarking takes 0.1461 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:48:21.7388206Z Autotune Choices Stats: 2025-09-07T14:48:21.7389412Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_268", "best_kernel_desc": "ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T14:48:21.7560980Z AUTOTUNE addmm(676x256, 676x64, 64x256) 2025-09-07T14:48:21.7561339Z strides: [0, 1], [64, 1], [1, 64] 2025-09-07T14:48:21.7561715Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:21.7562551Z triton_mm_268 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T14:48:21.7563935Z triton_mm_269 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.7565372Z triton_mm_270 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:21.7566567Z triton_mm_271 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:48:21.7567752Z triton_mm_272 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:21.7568933Z triton_mm_274 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.7570115Z triton_mm_275 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:48:21.7571309Z triton_mm_276 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, 
BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:21.7572066Z bias_addmm 0.0092 ms 88.9% 2025-09-07T14:48:21.7572838Z triton_mm_273 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.7573884Z SingleProcess AUTOTUNE benchmarking takes 0.3021 seconds and 0.0003 seconds precompiling for 21 choices 2025-09-07T14:48:21.8846865Z Autotune Choices Stats: 2025-09-07T14:48:21.8848505Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01228800043463707, "best_triton_pos": 1, "best_triton_time": 0.020479999482631683, "best_triton_kernel": "triton_convolution2d_291", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T14:48:21.9013404Z AUTOTUNE convolution(4x64x13x13, 256x64x3x3) 2025-09-07T14:48:21.9013796Z strides: [10816, 1, 832, 64], [576, 1, 192, 64] 2025-09-07T14:48:21.9014147Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:48:21.9014619Z convolution 0.0123 ms 100.0% 2025-09-07T14:48:21.9015514Z triton_convolution2d_291 0.0205 ms 60.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.9017088Z triton_convolution2d_290 0.0287 ms 42.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:21.9018651Z triton_convolution2d_292 0.0297 ms 41.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, 
num_stages=2, num_warps=8 2025-09-07T14:48:21.9020157Z triton_convolution2d_293 0.0328 ms 37.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:48:21.9021633Z triton_convolution2d_288 0.0379 ms 32.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.9023122Z triton_convolution2d_287 0.0399 ms 30.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:48:21.9024693Z triton_convolution2d_289 0.0553 ms 22.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:48:21.9025881Z SingleProcess AUTOTUNE benchmarking takes 0.1448 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:48:22.2007284Z Autotune Choices Stats: 2025-09-07T14:48:22.2008776Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "bias_addmm", "best_time": 0.014336000196635723, "best_triton_pos": 1, "best_triton_time": 0.014336000196635723, "best_triton_kernel": "triton_mm_349", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T14:48:22.2179300Z AUTOTUNE addmm(676x1000, 676x512, 512x1000) 2025-09-07T14:48:22.2179647Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T14:48:22.2180012Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:48:22.2180401Z bias_addmm 0.0143 ms 100.0% 2025-09-07T14:48:22.2181138Z triton_mm_349 0.0143 ms 100.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:22.2182514Z triton_mm_345 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:48:22.2183705Z triton_mm_348 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:22.2184883Z triton_mm_351 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:48:22.2186072Z triton_mm_347 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:22.2187255Z triton_mm_350 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:22.2188003Z addmm 0.0184 ms 77.8% 2025-09-07T14:48:22.2188778Z triton_mm_346 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:48:22.2190036Z triton_mm_354 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:48:22.2191062Z SingleProcess AUTOTUNE benchmarking takes 0.3062 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T14:48:25.8773918Z pass 2025-09-07T14:48:28.7878514Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. 
Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:48:28.7879930Z import pynvml # type: ignore[import] 2025-09-07T14:48:31.3929954Z 2025-09-07T14:48:33.1918674Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:48:33.1918957Z 2025-09-07T14:48:33.3629265Z Loading pipeline components...: 0% 0/6 [00:00 2025-09-07T14:58:11.9376107Z W0907 14:58:11.934000 641252 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T14:58:11.9377517Z W0907 14:58:11.934000 641252 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T14:58:11.9379063Z W0907 14:58:11.934000 641252 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T14:58:11.9379982Z W0907 14:58:11.934000 641252 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:58:11.9380663Z W0907 14:58:11.934000 641252 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T14:58:12.2061664Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T14:58:12.2062544Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T14:58:12.2063438Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T14:58:12.2063818Z 2025-09-07T14:58:12.4604916Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T14:58:12.4605800Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T14:58:12.4606732Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T14:58:12.4607351Z 2025-09-07T14:58:24.9727170Z pass 2025-09-07T14:58:28.7124170Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:58:28.7125972Z import pynvml # type: ignore[import] 2025-09-07T14:58:31.3147187Z 2025-09-07T14:58:33.3884613Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:58:33.3885020Z loading model: 0it [00:02, ?it/s] 2025-09-07T14:58:33.4011192Z cuda eval yolov3 2025-09-07T14:59:02.4517415Z Autotune Choices Stats: 2025-09-07T14:59:02.4520045Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_17", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.03788800165057182, "best_triton_pos": 0} 2025-09-07T14:59:02.4787152Z AUTOTUNE addmm(196608x32, 196608x64, 64x32) 2025-09-07T14:59:02.4787566Z strides: [0, 1], [64, 1], [1, 64] 2025-09-07T14:59:02.4787956Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:02.4788790Z triton_mm_17 0.0379 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:02.4789979Z triton_mm_20 0.0379 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:02.4791431Z 
triton_mm_23 0.0379 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:02.4792621Z triton_mm_24 0.0379 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:59:02.4793834Z triton_mm_28 0.0379 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:02.4795028Z triton_mm_29 0.0399 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:02.4796207Z triton_mm_14 0.0410 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:59:02.4797382Z triton_mm_25 0.0430 ms 88.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:02.4798659Z triton_mm_19 0.0440 ms 86.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:59:02.4799836Z triton_mm_21 0.0440 ms 86.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:02.4800869Z SingleProcess AUTOTUNE benchmarking takes 0.3619 seconds and 0.0004 seconds precompiling for 19 choices 2025-09-07T14:59:03.1072267Z Autotune Choices Stats: 2025-09-07T14:59:03.1073490Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_55", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.025599999353289604, "best_triton_pos": 0} 2025-09-07T14:59:03.1328399Z AUTOTUNE addmm(49152x64, 49152x128, 128x64) 2025-09-07T14:59:03.1328783Z strides: [0, 1], [128, 1], [1, 128] 2025-09-07T14:59:03.1330274Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:03.1331140Z triton_mm_55 0.0256 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:03.1332482Z triton_mm_60 0.0256 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:03.1333669Z triton_mm_51 0.0266 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:03.1334858Z triton_mm_61 0.0266 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:03.1336041Z triton_mm_52 0.0276 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:03.1337202Z triton_mm_53 0.0276 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:03.1338470Z triton_mm_56 0.0276 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T14:59:03.1339645Z triton_mm_57 0.0276 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:03.1340927Z triton_mm_58 0.0276 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:03.1341691Z bias_addmm 0.0287 ms 89.3% 2025-09-07T14:59:03.1342261Z SingleProcess AUTOTUNE benchmarking takes 0.3395 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T14:59:03.7366046Z Autotune Choices Stats: 2025-09-07T14:59:03.7367266Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_111", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.01740800030529499, "best_triton_pos": 0} 2025-09-07T14:59:03.7620064Z AUTOTUNE addmm(12288x128, 12288x256, 256x128) 2025-09-07T14:59:03.7620438Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T14:59:03.7620813Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:03.7621662Z triton_mm_111 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:03.7623110Z triton_mm_112 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:03.7624317Z triton_mm_113 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:03.7625521Z triton_mm_114 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:03.7626730Z triton_mm_117 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:03.7627928Z triton_mm_110 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:03.7629237Z triton_mm_116 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:03.7630434Z triton_mm_118 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:03.7631278Z bias_addmm 0.0195 ms 89.5% 2025-09-07T14:59:03.7631998Z triton_mm_107 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:59:03.7633028Z SingleProcess AUTOTUNE benchmarking takes 0.3239 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T14:59:04.5146721Z Autotune Choices Stats: 2025-09-07T14:59:04.5147970Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_866", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.021503999829292297, "best_triton_pos": 0} 2025-09-07T14:59:04.5395030Z AUTOTUNE addmm(12288x128, 12288x384, 384x128) 2025-09-07T14:59:04.5395440Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T14:59:04.5395817Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:04.5396650Z triton_mm_866 0.0215 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:04.5398095Z 
triton_mm_871 0.0215 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:04.5399309Z triton_mm_872 0.0215 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:04.5400524Z triton_mm_864 0.0225 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:04.5401710Z triton_mm_865 0.0225 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:04.5403102Z triton_mm_867 0.0225 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:04.5403877Z bias_addmm 0.0236 ms 91.3% 2025-09-07T14:59:04.5404587Z triton_mm_868 0.0236 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:04.5405878Z triton_mm_870 0.0236 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:04.5407078Z triton_mm_862 0.0246 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:04.5408113Z SingleProcess AUTOTUNE benchmarking takes 0.3344 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T14:59:04.9124353Z Autotune Choices Stats: 2025-09-07T14:59:04.9125525Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_319", 
"best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.014336000196635723, "best_triton_pos": 0} 2025-09-07T14:59:04.9378592Z AUTOTUNE addmm(3072x256, 3072x512, 512x256) 2025-09-07T14:59:04.9378953Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T14:59:04.9379327Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:04.9380348Z triton_mm_319 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:04.9381637Z triton_mm_315 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:04.9382817Z triton_mm_318 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:04.9384011Z triton_mm_321 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:04.9385207Z triton_mm_317 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:04.9386396Z triton_mm_320 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:04.9387586Z triton_mm_325 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:04.9388335Z bias_addmm 0.0184 ms 77.8% 
2025-09-07T14:59:04.9389062Z triton_mm_316 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:04.9390344Z triton_mm_324 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:04.9391382Z SingleProcess AUTOTUNE benchmarking takes 0.3273 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T14:59:05.6868836Z Autotune Choices Stats: 2025-09-07T14:59:05.6870042Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_755", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.01740800030529499, "best_triton_pos": 0} 2025-09-07T14:59:05.7120597Z AUTOTUNE addmm(3072x256, 3072x768, 768x256) 2025-09-07T14:59:05.7120964Z strides: [0, 1], [768, 1], [1, 768] 2025-09-07T14:59:05.7121358Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:05.7122214Z triton_mm_755 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:05.7123850Z triton_mm_751 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:05.7125052Z triton_mm_757 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:05.7126280Z triton_mm_754 0.0195 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 
2025-09-07T14:59:05.7127040Z bias_addmm 0.0215 ms 81.0% 2025-09-07T14:59:05.7127756Z triton_mm_752 0.0215 ms 81.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:05.7128957Z triton_mm_761 0.0215 ms 81.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:05.7130252Z triton_mm_753 0.0225 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:05.7131449Z triton_mm_756 0.0225 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:05.7132726Z triton_mm_760 0.0225 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:05.7133761Z SingleProcess AUTOTUNE benchmarking takes 0.3305 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T14:59:06.0880467Z Autotune Choices Stats: 2025-09-07T14:59:06.0881673Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_523", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.015359999611973763, "best_triton_pos": 0} 2025-09-07T14:59:06.1137591Z AUTOTUNE addmm(768x512, 768x1024, 1024x512) 2025-09-07T14:59:06.1137995Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T14:59:06.1138381Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:06.1139219Z triton_mm_523 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:06.1139985Z bias_addmm 0.0174 ms 88.2% 2025-09-07T14:59:06.1140708Z triton_mm_522 0.0174 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:06.1142106Z triton_mm_519 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:06.1143306Z triton_mm_526 0.0195 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:06.1144496Z triton_mm_516 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:59:06.1145681Z triton_mm_517 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:06.1146859Z triton_mm_528 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:06.1147608Z addmm 0.0215 ms 71.4% 2025-09-07T14:59:06.1148383Z triton_mm_518 0.0215 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:06.1149410Z SingleProcess AUTOTUNE benchmarking takes 0.3303 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T14:59:06.8306817Z Autotune Choices Stats: 2025-09-07T14:59:06.8308054Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_666", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.020479999482631683, "best_triton_pos": 0} 2025-09-07T14:59:06.8564436Z AUTOTUNE addmm(768x512, 768x2048, 2048x512) 2025-09-07T14:59:06.8564853Z strides: [0, 1], [2048, 1], [1, 2048] 2025-09-07T14:59:06.8565239Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:06.8566072Z triton_mm_666 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:06.8566852Z bias_addmm 0.0236 ms 87.0% 2025-09-07T14:59:06.8567394Z addmm 0.0266 ms 76.9% 2025-09-07T14:59:06.8568106Z triton_mm_665 0.0266 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:06.8569406Z triton_mm_662 0.0276 ms 74.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:06.8570586Z triton_mm_669 0.0307 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:06.8571777Z triton_mm_659 0.0328 ms 62.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:59:06.8572969Z triton_mm_660 0.0338 ms 60.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:06.8574138Z triton_mm_661 0.0338 ms 60.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:06.8575326Z 
triton_mm_668 0.0338 ms 60.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:06.8576363Z SingleProcess AUTOTUNE benchmarking takes 0.3656 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T14:59:07.6680678Z Autotune Choices Stats: 2025-09-07T14:59:07.6682407Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.1812479943037033, "best_triton_pos": 1, "best_triton_time": 0.19763199985027313, "best_triton_kernel": "triton_convolution2d_2", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8"} 2025-09-07T14:59:07.6934689Z AUTOTUNE convolution(4x3x384x512, 32x3x3x3) 2025-09-07T14:59:07.6935075Z strides: [589824, 1, 1536, 3], [27, 1, 9, 3] 2025-09-07T14:59:07.6935423Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:59:07.6935739Z convolution 0.1812 ms 100.0% 2025-09-07T14:59:07.6936630Z triton_convolution2d_2 0.1976 ms 91.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:59:07.6938177Z triton_convolution2d_4 0.2284 ms 79.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:07.6939908Z triton_convolution2d_0 0.2417 ms 75.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:07.6941390Z triton_convolution2d_3 0.2632 ms 68.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, 
PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:07.6942876Z triton_convolution2d_1 0.3287 ms 55.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:07.6944356Z triton_convolution2d_5 0.3615 ms 50.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:07.6945521Z SingleProcess AUTOTUNE benchmarking takes 0.2889 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T14:59:07.9078598Z Autotune Choices Stats: 2025-09-07T14:59:07.9080420Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.09216000139713287, "best_triton_pos": 1, "best_triton_time": 0.10342399775981903, "best_triton_kernel": "triton_convolution2d_9", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T14:59:07.9329989Z AUTOTUNE convolution(4x32x384x512, 64x32x3x3) 2025-09-07T14:59:07.9330385Z strides: [6291456, 1, 16384, 32], [288, 1, 96, 32] 2025-09-07T14:59:07.9330742Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:59:07.9331076Z convolution 0.0922 ms 100.0% 2025-09-07T14:59:07.9331968Z triton_convolution2d_9 0.1034 ms 89.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:07.9333459Z triton_convolution2d_10 0.1065 ms 86.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, 
num_warps=4 2025-09-07T14:59:07.9334955Z triton_convolution2d_11 0.1085 ms 84.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:07.9336435Z triton_convolution2d_7 0.1147 ms 80.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:07.9338109Z triton_convolution2d_12 0.1167 ms 78.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:07.9339601Z triton_convolution2d_6 0.2017 ms 45.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:07.9341070Z triton_convolution2d_8 0.3297 ms 28.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:59:07.9342246Z SingleProcess AUTOTUNE benchmarking takes 0.2386 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:59:08.1338577Z Autotune Choices Stats: 2025-09-07T14:59:08.1340332Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.06758400052785873, "best_triton_pos": 1, "best_triton_time": 0.08191999793052673, "best_triton_kernel": "triton_convolution2d_33", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T14:59:08.1594437Z AUTOTUNE convolution(4x32x192x256, 64x32x3x3) 2025-09-07T14:59:08.1594836Z 
strides: [1572864, 1, 8192, 32], [288, 1, 96, 32] 2025-09-07T14:59:08.1595187Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:59:08.1595516Z convolution 0.0676 ms 100.0% 2025-09-07T14:59:08.1596405Z triton_convolution2d_33 0.0819 ms 82.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:08.1597903Z triton_convolution2d_36 0.0942 ms 71.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:08.1607773Z triton_convolution2d_31 0.0963 ms 70.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.1609389Z triton_convolution2d_34 0.0983 ms 68.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.1610944Z triton_convolution2d_35 0.0993 ms 68.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:08.1612433Z triton_convolution2d_30 0.1935 ms 34.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.1613944Z triton_convolution2d_32 0.2130 ms 31.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:59:08.1615126Z SingleProcess AUTOTUNE benchmarking takes 0.2256 seconds and 
0.0002 seconds precompiling for 8 choices 2025-09-07T14:59:08.3507238Z Autotune Choices Stats: 2025-09-07T14:59:08.3508874Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.06348799914121628, "best_triton_pos": 1, "best_triton_time": 0.07168000191450119, "best_triton_kernel": "triton_convolution2d_40", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T14:59:08.3763068Z AUTOTUNE convolution(4x64x192x256, 128x64x3x3) 2025-09-07T14:59:08.3763500Z strides: [3145728, 1, 16384, 64], [576, 1, 192, 64] 2025-09-07T14:59:08.3763864Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:59:08.3764201Z convolution 0.0635 ms 100.0% 2025-09-07T14:59:08.3765125Z triton_convolution2d_40 0.0717 ms 88.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:08.3766643Z triton_convolution2d_43 0.0870 ms 72.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:08.3768135Z triton_convolution2d_42 0.0901 ms 70.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:08.3769612Z triton_convolution2d_41 0.0952 ms 66.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.3771300Z triton_convolution2d_38 0.1034 ms 61.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, 
PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.3772803Z triton_convolution2d_37 0.1352 ms 47.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.3774300Z triton_convolution2d_39 0.2673 ms 23.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:59:08.3775506Z SingleProcess AUTOTUNE benchmarking takes 0.2160 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:59:08.5626911Z Autotune Choices Stats: 2025-09-07T14:59:08.5628718Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.05222399905323982, "best_triton_pos": 1, "best_triton_time": 0.06143999844789505, "best_triton_kernel": "triton_convolution2d_65", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T14:59:08.5881241Z AUTOTUNE convolution(4x64x96x128, 128x64x3x3) 2025-09-07T14:59:08.5881664Z strides: [786432, 1, 8192, 64], [576, 1, 192, 64] 2025-09-07T14:59:08.5882028Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:59:08.5882355Z convolution 0.0522 ms 100.0% 2025-09-07T14:59:08.5883418Z triton_convolution2d_65 0.0614 ms 85.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:08.5884923Z triton_convolution2d_68 0.0686 ms 76.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 
2025-09-07T14:59:08.5886402Z triton_convolution2d_67 0.0809 ms 64.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:08.5887891Z triton_convolution2d_63 0.0891 ms 58.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.5889491Z triton_convolution2d_66 0.0911 ms 57.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.5890977Z triton_convolution2d_62 0.1321 ms 39.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.5892463Z triton_convolution2d_64 0.2314 ms 22.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:59:08.5893626Z SingleProcess AUTOTUNE benchmarking takes 0.2110 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:59:08.7742968Z Autotune Choices Stats: 2025-09-07T14:59:08.7744623Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.048128001391887665, "best_triton_pos": 1, "best_triton_time": 0.06451199948787689, "best_triton_kernel": "triton_convolution2d_97", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T14:59:08.7991351Z AUTOTUNE convolution(4x128x96x128, 256x128x3x3) 2025-09-07T14:59:08.7991756Z strides: 
[1572864, 1, 16384, 128], [1152, 1, 384, 128] 2025-09-07T14:59:08.7992137Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:59:08.7992472Z convolution 0.0481 ms 100.0% 2025-09-07T14:59:08.7993383Z triton_convolution2d_97 0.0645 ms 74.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:08.7994889Z triton_convolution2d_100 0.0666 ms 72.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:08.7996387Z triton_convolution2d_99 0.0676 ms 71.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:08.7997950Z triton_convolution2d_98 0.0819 ms 58.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.7999508Z triton_convolution2d_95 0.0829 ms 58.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.8001001Z triton_convolution2d_94 0.1044 ms 46.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:08.8002494Z triton_convolution2d_96 0.2447 ms 19.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:59:08.8003868Z SingleProcess AUTOTUNE benchmarking takes 0.2032 seconds and 
0.0002 seconds precompiling for 8 choices 2025-09-07T14:59:08.9753095Z Autotune Choices Stats: 2025-09-07T14:59:08.9754742Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04198399931192398, "best_triton_pos": 1, "best_triton_time": 0.058368001133203506, "best_triton_kernel": "triton_convolution2d_122", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T14:59:09.0001570Z AUTOTUNE convolution(4x128x48x64, 256x128x3x3) 2025-09-07T14:59:09.0001971Z strides: [393216, 1, 8192, 128], [1152, 1, 384, 128] 2025-09-07T14:59:09.0002331Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:59:09.0002662Z convolution 0.0420 ms 100.0% 2025-09-07T14:59:09.0003738Z triton_convolution2d_122 0.0584 ms 71.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:09.0005254Z triton_convolution2d_125 0.0625 ms 67.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:09.0006755Z triton_convolution2d_124 0.0635 ms 66.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:09.0008244Z triton_convolution2d_123 0.0758 ms 55.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.0009813Z triton_convolution2d_120 0.0799 ms 52.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, 
PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.0011310Z triton_convolution2d_119 0.1014 ms 41.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.0012807Z triton_convolution2d_121 0.2324 ms 18.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:59:09.0014006Z SingleProcess AUTOTUNE benchmarking takes 0.2003 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:59:09.2537267Z Autotune Choices Stats: 2025-09-07T14:59:09.2539092Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.043007999658584595, "best_triton_pos": 1, "best_triton_time": 0.07680000364780426, "best_triton_kernel": "triton_convolution2d_305", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T14:59:09.2786690Z AUTOTUNE convolution(4x256x48x64, 512x256x3x3) 2025-09-07T14:59:09.2787098Z strides: [786432, 1, 16384, 256], [2304, 1, 768, 256] 2025-09-07T14:59:09.2787458Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:59:09.2787787Z convolution 0.0430 ms 100.0% 2025-09-07T14:59:09.2788668Z triton_convolution2d_305 0.0768 ms 56.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.2790172Z triton_convolution2d_304 0.1075 ms 40.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, 
num_stages=2, num_warps=8 2025-09-07T14:59:09.2791679Z triton_convolution2d_307 0.1096 ms 39.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:09.2793173Z triton_convolution2d_306 0.1116 ms 38.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:09.2794644Z triton_convolution2d_302 0.1423 ms 30.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.2796226Z triton_convolution2d_301 0.1454 ms 29.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.2797724Z triton_convolution2d_303 0.2499 ms 17.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:59:09.2798910Z SingleProcess AUTOTUNE benchmarking takes 0.2289 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:59:09.4799153Z Autotune Choices Stats: 2025-09-07T14:59:09.4800806Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04095999896526337, "best_triton_pos": 1, "best_triton_time": 0.07372800260782242, "best_triton_kernel": "triton_convolution2d_330", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T14:59:09.5048539Z AUTOTUNE convolution(4x256x24x32, 512x256x3x3) 
2025-09-07T14:59:09.5048949Z strides: [196608, 1, 8192, 256], [2304, 1, 768, 256] 2025-09-07T14:59:09.5049501Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:59:09.5049824Z convolution 0.0410 ms 100.0% 2025-09-07T14:59:09.5050708Z triton_convolution2d_330 0.0737 ms 55.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.5052229Z triton_convolution2d_329 0.1024 ms 40.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:09.5053729Z triton_convolution2d_332 0.1055 ms 38.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:09.5055306Z triton_convolution2d_331 0.1096 ms 37.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:09.5056805Z triton_convolution2d_327 0.1393 ms 29.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.5058465Z triton_convolution2d_326 0.1423 ms 28.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.5059955Z triton_convolution2d_328 0.2355 ms 17.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:59:09.5061142Z SingleProcess 
AUTOTUNE benchmarking takes 0.2254 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:59:09.8153181Z Autotune Choices Stats: 2025-09-07T14:59:09.8154817Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.050175998359918594, "best_triton_pos": 1, "best_triton_time": 0.13516800105571747, "best_triton_kernel": "triton_convolution2d_512", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T14:59:09.8403372Z AUTOTUNE convolution(4x512x24x32, 1024x512x3x3) 2025-09-07T14:59:09.8403813Z strides: [393216, 1, 16384, 512], [4608, 1, 1536, 512] 2025-09-07T14:59:09.8404322Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:59:09.8404636Z convolution 0.0502 ms 100.0% 2025-09-07T14:59:09.8405528Z triton_convolution2d_512 0.1352 ms 37.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.8407028Z triton_convolution2d_511 0.2058 ms 24.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:09.8408525Z triton_convolution2d_513 0.2120 ms 23.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:09.8410014Z triton_convolution2d_514 0.2161 ms 23.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:09.8411505Z triton_convolution2d_508 0.2724 ms 18.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, 
BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.8413063Z triton_convolution2d_509 0.2785 ms 18.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:09.8414561Z triton_convolution2d_510 0.4342 ms 11.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:59:09.8415744Z SingleProcess AUTOTUNE benchmarking takes 0.2855 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:59:10.1012368Z Autotune Choices Stats: 2025-09-07T14:59:10.1014112Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.048128001391887665, "best_triton_pos": 1, "best_triton_time": 0.13414399325847626, "best_triton_kernel": "triton_convolution2d_537", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T14:59:10.1263096Z AUTOTUNE convolution(4x512x12x16, 1024x512x3x3) 2025-09-07T14:59:10.1263588Z strides: [98304, 1, 8192, 512], [4608, 1, 1536, 512] 2025-09-07T14:59:10.1263957Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T14:59:10.1264282Z convolution 0.0481 ms 100.0% 2025-09-07T14:59:10.1265151Z triton_convolution2d_537 0.1341 ms 35.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:10.1266662Z triton_convolution2d_536 0.1956 ms 24.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, 
STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:10.1268172Z triton_convolution2d_538 0.2079 ms 23.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:10.1269666Z triton_convolution2d_539 0.2120 ms 22.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T14:59:10.1271168Z triton_convolution2d_533 0.2683 ms 17.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:10.1272739Z triton_convolution2d_534 0.2734 ms 17.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T14:59:10.1274237Z triton_convolution2d_535 0.4065 ms 11.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T14:59:10.1275416Z SingleProcess AUTOTUNE benchmarking takes 0.2851 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T14:59:10.4933469Z Autotune Choices Stats: 2025-09-07T14:59:10.4934638Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_712", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.013311999849975109, "best_triton_pos": 0} 2025-09-07T14:59:10.5198611Z AUTOTUNE addmm(768x255, 768x1024, 1024x255) 2025-09-07T14:59:10.5198982Z strides: [0, 1], [1024, 1], [1, 1024] 
2025-09-07T14:59:10.5199359Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:10.5200394Z triton_mm_712 0.0133 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:10.5201608Z triton_mm_710 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:10.5203015Z triton_mm_716 0.0154 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:10.5204209Z triton_mm_711 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:10.5205405Z triton_mm_709 0.0174 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:59:10.5206603Z triton_mm_715 0.0174 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:10.5207875Z triton_mm_718 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:10.5209125Z triton_mm_719 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:10.5210313Z triton_mm_721 0.0215 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:10.5211508Z 
triton_mm_717 0.0276 ms 48.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:10.5212537Z SingleProcess AUTOTUNE benchmarking takes 0.3435 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T14:59:10.8029532Z Autotune Choices Stats: 2025-09-07T14:59:10.8031025Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "bias_addmm", "best_time": 0.011264000087976456, "best_triton_pos": 1, "best_triton_time": 0.011264000087976456, "best_triton_kernel": "triton_mm_728", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T14:59:10.8287492Z AUTOTUNE addmm(768x256, 768x512, 512x256) 2025-09-07T14:59:10.8287834Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T14:59:10.8288376Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:10.8288764Z bias_addmm 0.0113 ms 100.0% 2025-09-07T14:59:10.8289500Z triton_mm_728 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:10.8290699Z triton_mm_729 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:10.8291884Z triton_mm_730 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:10.8293085Z triton_mm_734 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:10.8294276Z triton_mm_727 0.0123 ms 91.7% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:59:10.8295464Z triton_mm_733 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:10.8296718Z triton_mm_737 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:10.8297999Z triton_mm_736 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:10.8299192Z triton_mm_739 0.0143 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:10.8300208Z SingleProcess AUTOTUNE benchmarking takes 0.3083 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T14:59:11.1501344Z Autotune Choices Stats: 2025-09-07T14:59:11.1502682Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_826", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.01740800030529499, "best_triton_pos": 0} 2025-09-07T14:59:11.1766759Z AUTOTUNE addmm(3072x255, 3072x512, 512x255) 2025-09-07T14:59:11.1768043Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T14:59:11.1768408Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:11.1769240Z triton_mm_826 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:11.1770445Z triton_mm_829 0.0174 ms 100.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:11.1771650Z triton_mm_832 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:11.1772857Z triton_mm_830 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:11.1774030Z triton_mm_828 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:11.1775223Z triton_mm_831 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:11.1776495Z triton_mm_827 0.0215 ms 81.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:11.1777812Z triton_mm_836 0.0225 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:11.1779012Z triton_mm_820 0.0236 ms 73.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:59:11.1780192Z triton_mm_821 0.0236 ms 73.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:11.1781216Z SingleProcess AUTOTUNE benchmarking takes 0.3254 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T14:59:11.4554925Z Autotune Choices Stats: 
2025-09-07T14:59:11.4556105Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_844", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T14:59:11.4834135Z AUTOTUNE addmm(3072x128, 3072x256, 256x128) 2025-09-07T14:59:11.4834632Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T14:59:11.4835015Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:11.4835843Z triton_mm_844 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:11.4837054Z triton_mm_845 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T14:59:11.4838251Z triton_mm_838 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:59:11.4839438Z triton_mm_839 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:11.4840688Z triton_mm_840 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:11.4841878Z triton_mm_847 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:11.4843365Z triton_mm_848 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=4 2025-09-07T14:59:11.4844544Z triton_mm_850 0.0123 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:11.4845302Z bias_addmm 0.0133 ms 84.6% 2025-09-07T14:59:11.4846038Z triton_mm_846 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:11.4847060Z SingleProcess AUTOTUNE benchmarking takes 0.3062 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T14:59:11.8340235Z Autotune Choices Stats: 2025-09-07T14:59:11.8341415Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_945", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.02969600073993206, "best_triton_pos": 0} 2025-09-07T14:59:11.8609046Z AUTOTUNE addmm(12288x255, 12288x256, 256x255) 2025-09-07T14:59:11.8609597Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T14:59:11.8609958Z dtypes: torch.bfloat16, torch.bfloat16, torch.bfloat16 2025-09-07T14:59:11.8610791Z triton_mm_945 0.0297 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:11.8612003Z triton_mm_937 0.0307 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T14:59:11.8613186Z triton_mm_941 0.0307 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:11.8614371Z triton_mm_943 0.0307 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, 
BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:11.8615557Z triton_mm_940 0.0317 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T14:59:11.8616725Z triton_mm_936 0.0328 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T14:59:11.8618049Z triton_mm_942 0.0328 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:11.8619256Z triton_mm_946 0.0328 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:11.8620446Z triton_mm_939 0.0338 ms 87.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T14:59:11.8621639Z triton_mm_947 0.0358 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T14:59:11.8622679Z SingleProcess AUTOTUNE benchmarking takes 0.3552 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T14:59:11.8791214Z cudagraph partition due to non gpu ops 2025-09-07T14:59:11.8791663Z cudagraph partition due to non gpu ops 2025-09-07T14:59:11.8792017Z cudagraph partition due to non gpu ops 2025-09-07T14:59:11.9440333Z cudagraph partition into 2 partitions 2025-09-07T14:59:32.5179451Z pass 2025-09-07T14:59:35.4641267Z accuracy pass_rate=88.89% 2025-09-07T14:59:35.4647671Z calls_captured gmean=0.00x mean=400.389x 2025-09-07T14:59:35.4652010Z unique_graphs gmean=0.00x mean=2.389x 2025-09-07T14:59:35.4656128Z 
graph_breaks gmean=0.00x mean=1.889x 2025-09-07T14:59:35.4660520Z unique_graph_breaks gmean=0.00x mean=0.389x 2025-09-07T14:59:35.4664596Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T14:59:35.4668922Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T14:59:35.4672982Z cudagraph_skips gmean=nanx mean=-0.333x 2025-09-07T14:59:35.4674262Z compilation_latency mean=30.899 seconds 2025-09-07T14:59:36.3228730Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *aotinductor-true* ]] 2025-09-07T14:59:36.3230201Z + [[ inference == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T14:59:36.3230530Z + [[ accuracy == \a\c\c\u\r\a\c\y ]] 2025-09-07T14:59:36.3232042Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --inference --bfloat16 --export --disable-cudagraphs --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_export_torchbench_bfloat16_inference_cuda_accuracy.csv 2025-09-07T14:59:36.8830979Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:59:36.8832471Z import pynvml # type: ignore[import] 2025-09-07T14:59:40.4451172Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T14:59:40.4452594Z import pynvml # type: ignore[import] 2025-09-07T14:59:43.0303101Z 2025-09-07T14:59:44.0155219Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:59:44.0155603Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:59:44.0157757Z cuda eval soft_actor_critic 2025-09-07T14:59:48.4418124Z pass 2025-09-07T14:59:50.0440499Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:59:50.0441902Z import pynvml # type: ignore[import] 2025-09-07T14:59:52.6566876Z 2025-09-07T14:59:54.3782815Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:59:54.3783189Z loading model: 0it [00:01, ?it/s] 2025-09-07T14:59:54.3848252Z cuda eval speech_transformer 2025-09-07T14:59:54.8283680Z ERROR:common: 2025-09-07T14:59:54.8283999Z Traceback (most recent call last): 2025-09-07T14:59:54.8284583Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 2320, in check_accuracy 2025-09-07T14:59:54.8285180Z optimized_model_iter_fn = optimize_ctx( 2025-09-07T14:59:54.8285708Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/common.py", line 1523, in export 2025-09-07T14:59:54.8286268Z ep = torch.export.export( 2025-09-07T14:59:54.8286844Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py", line 311, in export 2025-09-07T14:59:54.8287426Z raise e 2025-09-07T14:59:54.8287921Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py", line 277, in export 2025-09-07T14:59:54.8288500Z return _export( 2025-09-07T14:59:54.8289336Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 1163, in wrapper 2025-09-07T14:59:54.8290042Z raise e 2025-09-07T14:59:54.8290532Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 1129, in wrapper 2025-09-07T14:59:54.8291115Z ep = fn(*args, **kwargs) 2025-09-07T14:59:54.8291715Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/exported_program.py", line 124, in wrapper 2025-09-07T14:59:54.8292357Z return fn(*args, **kwargs) 2025-09-07T14:59:54.8292916Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 2255, in _export 2025-09-07T14:59:54.8293500Z ep = _export_for_training( 2025-09-07T14:59:54.8294061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 1163, in wrapper 2025-09-07T14:59:54.8294636Z raise e 2025-09-07T14:59:54.8295134Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 1129, in wrapper 2025-09-07T14:59:54.8295719Z ep = fn(*args, **kwargs) 2025-09-07T14:59:54.8296309Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/exported_program.py", line 124, in wrapper 2025-09-07T14:59:54.8296947Z return fn(*args, **kwargs) 2025-09-07T14:59:54.8297658Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 2071, in _export_for_training 2025-09-07T14:59:54.8298328Z export_artifact = export_func( 2025-09-07T14:59:54.8299041Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 1415, in _strict_export 2025-09-07T14:59:54.8299681Z gm_torch_level = _export_to_torch_ir( 2025-09-07T14:59:54.8300336Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 812, in _export_to_torch_ir 2025-09-07T14:59:54.8301000Z gm_torch_level, _ = torch._dynamo.export( 2025-09-07T14:59:54.8301623Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 2002, in inner 2025-09-07T14:59:54.8302231Z result_traced = opt_f(*args, **kwargs) 2025-09-07T14:59:54.8302851Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 414, in __call__ 2025-09-07T14:59:54.8303479Z return super().__call__(*args, **kwargs) 2025-09-07T14:59:54.8304163Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T14:59:54.8304853Z return self._call_impl(*args, **kwargs) 2025-09-07T14:59:54.8305478Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T14:59:54.8306118Z return forward_call(*args, **kwargs) 2025-09-07T14:59:54.8306769Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 841, in compile_wrapper 2025-09-07T14:59:54.8307673Z raise e.with_traceback(None) from e.__cause__ # User compiler error 2025-09-07T14:59:54.8308225Z torch._dynamo.exc.Unsupported: Dynamic slicing with Tensor arguments 2025-09-07T14:59:54.8308980Z Explanation: Creating slices with Tensor arguments is not supported. e.g. `l[:x]`, where `x` is a 1-element tensor. 2025-09-07T14:59:54.8310157Z Hint: It may be possible to write Dynamo tracing rules for this code. Please report an issue to PyTorch if you encounter this graph break often and it is causing performance issues. 
2025-09-07T14:59:54.8310919Z 2025-09-07T14:59:54.8311466Z Developer debug context: SliceVariable start: TensorVariable(), stop: ConstantVariable(NoneType: None), step: ConstantVariable(NoneType: None) 2025-09-07T14:59:54.8312133Z 2025-09-07T14:59:54.8312605Z For more details about this graph break, please visit: https://meta-pytorch.github.io/compile-graph-break-site/gb/gb0038.html 2025-09-07T14:59:54.8313178Z 2025-09-07T14:59:54.8313287Z from user code: 2025-09-07T14:59:54.8314046Z File "/torchbench/torchbenchmark/models/speech_transformer/speech_transformer/transformer/transformer.py", line 27, in forward 2025-09-07T14:59:54.8314892Z encoder_padded_outputs, *_ = self.encoder(padded_input, input_lengths) 2025-09-07T14:59:54.8315711Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T14:59:54.8316360Z return forward_call(*args, **kwargs) 2025-09-07T14:59:54.8317069Z File "/torchbench/torchbenchmark/models/speech_transformer/speech_transformer/transformer/encoder.py", line 61, in forward 2025-09-07T14:59:54.8317893Z non_pad_mask = get_non_pad_mask(padded_input, input_lengths=input_lengths) 2025-09-07T14:59:54.8318736Z File "/torchbench/torchbenchmark/models/speech_transformer/speech_transformer/utils/utils.py", line 108, in get_non_pad_mask 2025-09-07T14:59:54.8319473Z non_pad_mask[i, input_lengths[i] :] = 0 2025-09-07T14:59:54.8319699Z 2025-09-07T14:59:54.8320335Z Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo" 2025-09-07T14:59:54.8321070Z 2025-09-07T14:59:54.8321306Z TorchDynamo optimized model failed to run because of following error 2025-09-07T14:59:54.8518852Z fail_to_run 2025-09-07T14:59:56.3645412Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. 
Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T14:59:56.3648601Z import pynvml # type: ignore[import] 2025-09-07T14:59:58.9690970Z 2025-09-07T14:59:59.9511553Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:59:59.9511942Z loading model: 0it [00:00, ?it/s] 2025-09-07T14:59:59.9527896Z cuda eval squeezenet1_1 2025-09-07T15:00:04.8324076Z pass 2025-09-07T15:00:06.5434938Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T15:00:06.5436363Z import pynvml # type: ignore[import] 2025-09-07T15:00:09.1214231Z 2025-09-07T15:00:10.8989641Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:00:10.8989932Z 2025-09-07T15:00:11.2024566Z Loading pipeline components...: 0% 0/6 [00:00", line 1, in 2025-09-07T15:09:13.4747156Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 1593, in wrapped_fn 2025-09-07T15:09:13.4747761Z return tuple(flat_fn(*args)) 2025-09-07T15:09:13.4748403Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/utils.py", line 187, in flat_fn 2025-09-07T15:09:13.4749067Z tree_out = fn(*args, **kwargs) 2025-09-07T15:09:13.4749836Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/graph_capture_wrappers.py", line 1354, in functional_call 2025-09-07T15:09:13.4750637Z out = mod(*args[params_len:], **kwargs) 2025-09-07T15:09:13.4751293Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 843, in module_call_wrapper 2025-09-07T15:09:13.4752005Z return self.call_module(mod, forward, args, kwargs) 2025-09-07T15:09:13.4752745Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1997, in call_module 2025-09-07T15:09:13.4753578Z return Tracer.call_module(self, m, forward, args, kwargs) 2025-09-07T15:09:13.4754265Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 560, in call_module 2025-09-07T15:09:13.4754899Z ret_val = forward(*args, **kwargs) 2025-09-07T15:09:13.4755507Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 836, in forward 2025-09-07T15:09:13.4756155Z return _orig_module_call(mod, *args, **kwargs) 2025-09-07T15:09:13.4756860Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T15:09:13.4757534Z return self._call_impl(*args, **kwargs) 2025-09-07T15:09:13.4758167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T15:09:13.4758802Z return forward_call(*args, **kwargs) 2025-09-07T15:09:13.4759392Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 1906, in forward 2025-09-07T15:09:13.4759991Z tree_out = mod(*args, **kwargs) 2025-09-07T15:09:13.4760631Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 843, in module_call_wrapper 2025-09-07T15:09:13.4761348Z return self.call_module(mod, forward, args, kwargs) 2025-09-07T15:09:13.4762150Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1997, in call_module 2025-09-07T15:09:13.4763051Z return Tracer.call_module(self, m, forward, args, kwargs) 2025-09-07T15:09:13.4763743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 560, in call_module 2025-09-07T15:09:13.4764377Z ret_val = forward(*args, **kwargs) 2025-09-07T15:09:13.4764991Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 836, in forward 2025-09-07T15:09:13.4765635Z return _orig_module_call(mod, *args, **kwargs) 2025-09-07T15:09:13.4766339Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T15:09:13.4767017Z return self._call_impl(*args, **kwargs) 2025-09-07T15:09:13.4767653Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T15:09:13.4768294Z return forward_call(*args, **kwargs) 2025-09-07T15:09:13.4768869Z File "/torchbench/torchbenchmark/models/tts_angular/model.py", line 73, in forward 2025-09-07T15:09:13.4769380Z d = self.layers(x) 2025-09-07T15:09:13.4770089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 843, in module_call_wrapper 2025-09-07T15:09:13.4770805Z return self.call_module(mod, forward, args, kwargs) 2025-09-07T15:09:13.4771548Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1997, in call_module 2025-09-07T15:09:13.4772324Z return Tracer.call_module(self, m, forward, args, kwargs) 2025-09-07T15:09:13.4773011Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 560, in call_module 2025-09-07T15:09:13.4773648Z ret_val = forward(*args, **kwargs) 2025-09-07T15:09:13.4774254Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 836, in forward 2025-09-07T15:09:13.4774899Z return _orig_module_call(mod, *args, **kwargs) 2025-09-07T15:09:13.4775601Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T15:09:13.4776282Z return self._call_impl(*args, **kwargs) 2025-09-07T15:09:13.4776920Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", 
line 1786, in _call_impl 2025-09-07T15:09:13.4777662Z return forward_call(*args, **kwargs) 2025-09-07T15:09:13.4778291Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py", line 250, in forward 2025-09-07T15:09:13.4778969Z input = module(input) 2025-09-07T15:09:13.4779590Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 843, in module_call_wrapper 2025-09-07T15:09:13.4780304Z return self.call_module(mod, forward, args, kwargs) 2025-09-07T15:09:13.4781055Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1997, in call_module 2025-09-07T15:09:13.4781824Z return Tracer.call_module(self, m, forward, args, kwargs) 2025-09-07T15:09:13.4782506Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 560, in call_module 2025-09-07T15:09:13.4783136Z ret_val = forward(*args, **kwargs) 2025-09-07T15:09:13.4783743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 836, in forward 2025-09-07T15:09:13.4784388Z return _orig_module_call(mod, *args, **kwargs) 2025-09-07T15:09:13.4785093Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T15:09:13.4785765Z return self._call_impl(*args, **kwargs) 2025-09-07T15:09:13.4786401Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T15:09:13.4787097Z return forward_call(*args, **kwargs) 2025-09-07T15:09:13.4787610Z File "/torchbench/torchbenchmark/models/tts_angular/model.py", line 18, in forward 2025-09-07T15:09:13.4788105Z o, (_, _) = self.lstm(x) 2025-09-07T15:09:13.4788730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 843, in module_call_wrapper 2025-09-07T15:09:13.4789454Z return self.call_module(mod, forward, args, 
kwargs) 2025-09-07T15:09:13.4790197Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1997, in call_module 2025-09-07T15:09:13.4790970Z return Tracer.call_module(self, m, forward, args, kwargs) 2025-09-07T15:09:13.4791653Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 560, in call_module 2025-09-07T15:09:13.4792288Z ret_val = forward(*args, **kwargs) 2025-09-07T15:09:13.4792894Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 836, in forward 2025-09-07T15:09:13.4793596Z return _orig_module_call(mod, *args, **kwargs) 2025-09-07T15:09:13.4794294Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T15:09:13.4795013Z return self._call_impl(*args, **kwargs) 2025-09-07T15:09:13.4795648Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T15:09:13.4796284Z return forward_call(*args, **kwargs) 2025-09-07T15:09:13.4796881Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 1046, in forward 2025-09-07T15:09:13.4797468Z self._update_flat_weights() 2025-09-07T15:09:13.4798106Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 393, in _update_flat_weights 2025-09-07T15:09:13.4798757Z self._init_flat_weights() 2025-09-07T15:09:13.4799370Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 215, in _init_flat_weights 2025-09-07T15:09:13.4800028Z self.flatten_parameters() 2025-09-07T15:09:13.4800631Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 255, in flatten_parameters 2025-09-07T15:09:13.4801271Z unique_data_ptrs = { 2025-09-07T15:09:13.4801818Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 256, in 2025-09-07T15:09:13.4802441Z p.data_ptr() # type: ignore[union-attr] 2025-09-07T15:09:13.4803296Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1409, in __torch_function__ 2025-09-07T15:09:13.4804087Z return func(*args, **kwargs) 2025-09-07T15:09:13.4804785Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1479, in __torch_function__ 2025-09-07T15:09:13.4805504Z return func(*args, **kwargs) 2025-09-07T15:09:13.4806169Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/non_strict_utils.py", line 1066, in __torch_function__ 2025-09-07T15:09:13.4806843Z return func(*args, **kwargs) 2025-09-07T15:09:13.4808459Z RuntimeError: Cannot access data pointer of Tensor (e.g. FakeTensor, FunctionalTensor). If you're using torch.compile/export/fx, it is likely that we are erroneously tracing into a custom kernel. To fix this, please wrap the custom kernel into an opaque custom op. Please see the following for details: https://pytorch.org/tutorials/advanced/custom_ops_landing_page.html 2025-09-07T15:09:13.4810269Z TorchDynamo optimized model failed to run because of following error 2025-09-07T15:09:13.4952734Z fail_to_run 2025-09-07T15:09:15.0376401Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T15:09:15.0377883Z import pynvml # type: ignore[import] 2025-09-07T15:09:17.6390199Z 2025-09-07T15:09:20.0360528Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:09:20.0360929Z loading model: 0it [00:02, ?it/s] 2025-09-07T15:09:20.0369473Z cuda eval vgg16 2025-09-07T15:09:37.4949498Z pass 2025-09-07T15:09:40.2913857Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T15:09:40.2915298Z import pynvml # type: ignore[import] 2025-09-07T15:09:42.8886034Z 2025-09-07T15:09:44.9236955Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:09:44.9237345Z loading model: 0it [00:02, ?it/s] 2025-09-07T15:09:44.9330413Z cuda eval yolov3 2025-09-07T15:10:14.3584980Z pass 2025-09-07T15:10:17.0499237Z accuracy pass_rate=78.57% 2025-09-07T15:10:17.0500228Z calls_captured gmean=0.00x mean=0.000x 2025-09-07T15:10:17.0504642Z unique_graphs gmean=0.00x mean=0.000x 2025-09-07T15:10:17.0508933Z graph_breaks gmean=0.00x mean=0.000x 2025-09-07T15:10:17.0513188Z unique_graph_breaks gmean=0.00x mean=0.000x 2025-09-07T15:10:17.0517263Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T15:10:17.0521346Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T15:10:17.0525894Z cudagraph_skips gmean=0.00x mean=0.000x 2025-09-07T15:10:17.0527000Z compilation_latency mean=0.000 seconds 2025-09-07T15:10:17.8713669Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *maxautotune-true* ]] 2025-09-07T15:10:17.8715119Z + TORCHINDUCTOR_MAX_AUTOTUNE=1 2025-09-07T15:10:17.8716612Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --inference --bfloat16 --backend 
inductor --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_max_autotune_torchbench_bfloat16_inference_cuda_accuracy.csv 2025-09-07T15:10:18.4496026Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T15:10:18.4500034Z import pynvml # type: ignore[import] 2025-09-07T15:10:21.9771854Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T15:10:21.9773301Z import pynvml # type: ignore[import] 2025-09-07T15:10:24.5883846Z 2025-09-07T15:10:25.5587871Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:10:25.5588271Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:10:25.5590999Z cuda eval soft_actor_critic 2025-09-07T15:10:31.7604674Z pass 2025-09-07T15:10:34.3084822Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T15:10:34.3086254Z import pynvml # type: ignore[import] 2025-09-07T15:10:36.9194584Z 2025-09-07T15:10:38.6397629Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:10:38.6398020Z loading model: 0it [00:01, ?it/s] 2025-09-07T15:10:38.6465519Z cuda eval speech_transformer 2025-09-07T15:10:47.4350887Z W0907 15:10:47.434000 657334 site-packages/torch/_inductor/utils.py:2298] [7/0_1] DeviceCopy in input program 2025-09-07T15:10:51.9360725Z cudagraph partition due to non gpu ops 2025-09-07T15:10:51.9361151Z cudagraph partition due to non gpu ops 2025-09-07T15:10:51.9361497Z cudagraph partition due to non gpu ops 2025-09-07T15:10:51.9361846Z cudagraph partition due to non gpu ops 2025-09-07T15:10:51.9362185Z cudagraph partition due to non gpu ops 2025-09-07T15:10:51.9362523Z cudagraph partition due to non gpu ops 2025-09-07T15:10:51.9363076Z cudagraph partition due to non gpu ops 2025-09-07T15:10:51.9363415Z cudagraph partition due to DeviceCopy ops 2025-09-07T15:10:51.9669696Z cudagraph partition into 2 partitions 2025-09-07T15:11:07.4811714Z W0907 15:11:07.480000 657334 site-packages/torch/_inductor/utils.py:2298] [13/0_1] DeviceCopy in input program 2025-09-07T15:11:13.9574259Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9574696Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9575040Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9575401Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9575980Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9576328Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9576685Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9577139Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9577519Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9577860Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9578198Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9578536Z cudagraph partition due to non gpu ops 
2025-09-07T15:11:13.9578864Z cudagraph partition due to non gpu ops 2025-09-07T15:11:13.9579207Z cudagraph partition due to DeviceCopy ops 2025-09-07T15:11:14.0062643Z cudagraph partition into 2 partitions 2025-09-07T15:11:16.7201288Z pass 2025-09-07T15:11:19.8825876Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T15:11:19.8827301Z import pynvml # type: ignore[import] 2025-09-07T15:11:22.4824143Z 2025-09-07T15:11:23.4648245Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:11:23.4648636Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:11:23.4664559Z cuda eval squeezenet1_1 2025-09-07T15:11:37.7332727Z pass 2025-09-07T15:11:40.4758352Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T15:11:40.4762529Z import pynvml # type: ignore[import] 2025-09-07T15:11:43.0772001Z 2025-09-07T15:11:44.8759925Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:11:44.8760206Z 2025-09-07T15:11:45.3334633Z Loading pipeline components...: 0% 0/6 [00:00 2025-09-07T15:20:15.9950621Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:20:15.9951028Z 2025-09-07T15:20:15.9951223Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T15:20:15.9951946Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:20:15.9952662Z anchors = self.anchor_generator(images, features) 2025-09-07T15:20:15.9953410Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:20:15.9954164Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:20:15.9954954Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:20:15.9955978Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:20:15.9956887Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:20:15.9957789Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:20:15.9958194Z 2025-09-07T15:20:15.9958390Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T15:20:15.9959125Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:20:15.9959823Z anchors = self.anchor_generator(images, features) 2025-09-07T15:20:15.9960572Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:20:15.9961329Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:20:15.9962124Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:20:15.9963319Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:20:15.9964214Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:20:15.9965121Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:20:15.9965537Z 2025-09-07T15:20:15.9965734Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T15:20:15.9966468Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:20:15.9967177Z anchors = self.anchor_generator(images, features) 2025-09-07T15:20:15.9967930Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:20:15.9968674Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:20:15.9969462Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:20:15.9970396Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:20:15.9971393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:20:15.9972356Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:20:15.9972812Z 2025-09-07T15:20:15.9972991Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T15:20:15.9973720Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:20:15.9974432Z anchors = self.anchor_generator(images, features) 2025-09-07T15:20:15.9975193Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:20:15.9975956Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:20:15.9976738Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:20:15.9977754Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:20:15.9978671Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:20:15.9979580Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:20:15.9979986Z 2025-09-07T15:20:16.0129957Z cudagraph partition into 2 partitions 2025-09-07T15:20:18.7209011Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] Graph break from `Tensor.item()`, consider setting: 2025-09-07T15:20:18.7210512Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] torch._dynamo.config.capture_scalar_outputs = True 2025-09-07T15:20:18.7211404Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] or: 2025-09-07T15:20:18.7212287Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] env TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 2025-09-07T15:20:18.7213341Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] to include these operations in the captured graph. 
2025-09-07T15:20:18.7214219Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T15:20:18.7215031Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] Graph break: from user code at: 2025-09-07T15:20:18.7216550Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in torch_dynamo_resume_in_forward_at_117 2025-09-07T15:20:18.7218472Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T15:20:18.7220055Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 763, in forward 2025-09-07T15:20:18.7221536Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T15:20:18.7222931Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 310, in forward 2025-09-07T15:20:18.7224247Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] self.scales, self.map_levels = _setup_scales( 2025-09-07T15:20:18.7225734Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in _setup_scales 2025-09-07T15:20:18.7227263Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat 
in features] 2025-09-07T15:20:18.7228770Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in 2025-09-07T15:20:18.7230180Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T15:20:18.7231612Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T15:20:18.7232995Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T15:20:18.7233921Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T15:20:18.7234610Z W0907 15:20:18.720000 681440 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T15:20:30.8114840Z pass 2025-09-07T15:20:34.6112458Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T15:20:34.6118183Z import pynvml # type: ignore[import] 2025-09-07T15:20:37.2156651Z 2025-09-07T15:20:39.2569612Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:20:39.2570009Z loading model: 0it [00:02, ?it/s] 2025-09-07T15:20:39.2665661Z cuda eval yolov3 2025-09-07T15:21:01.2609436Z Autotune Choices Stats: 2025-09-07T15:21:01.2610662Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_23", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.03686400130391121, "best_triton_pos": 0} 2025-09-07T15:21:01.2904574Z AUTOTUNE mm(196608x64, 64x32) 2025-09-07T15:21:01.2904900Z strides: [64, 1], [1, 64] 2025-09-07T15:21:01.2905196Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T15:21:01.2905968Z triton_mm_23 0.0369 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:01.2907168Z triton_mm_24 0.0369 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T15:21:01.2908346Z triton_mm_14 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T15:21:01.2909516Z triton_mm_17 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T15:21:01.2910682Z triton_mm_20 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T15:21:01.2911863Z triton_mm_28 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, 
BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:01.2913042Z triton_mm_29 0.0379 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:01.2914460Z triton_mm_21 0.0410 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:01.2915742Z triton_mm_25 0.0420 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:01.2916987Z triton_mm_27 0.0420 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T15:21:01.2918014Z SingleProcess AUTOTUNE benchmarking takes 0.3253 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T15:21:01.8886830Z Autotune Choices Stats: 2025-09-07T15:21:01.8888082Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_60", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.023552000522613525, "best_triton_pos": 0} 2025-09-07T15:21:01.9182325Z AUTOTUNE mm(49152x128, 128x64) 2025-09-07T15:21:01.9182649Z strides: [128, 1], [1, 128] 2025-09-07T15:21:01.9182965Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T15:21:01.9183752Z triton_mm_60 0.0236 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:01.9184940Z triton_mm_51 0.0246 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T15:21:01.9186293Z triton_mm_55 0.0246 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:01.9187474Z triton_mm_57 0.0246 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:01.9188641Z triton_mm_52 0.0256 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T15:21:01.9189810Z triton_mm_54 0.0256 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:01.9190985Z triton_mm_56 0.0256 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T15:21:01.9192153Z triton_mm_50 0.0266 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T15:21:01.9193320Z triton_mm_53 0.0266 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:01.9194495Z triton_mm_58 0.0266 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:01.9195525Z SingleProcess AUTOTUNE benchmarking takes 0.3078 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T15:21:02.5190507Z Autotune Choices Stats: 2025-09-07T15:21:02.5191730Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": 
"triton_mm_117", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.016383999958634377, "best_triton_pos": 0} 2025-09-07T15:21:02.5473614Z AUTOTUNE mm(12288x256, 256x128) 2025-09-07T15:21:02.5473929Z strides: [256, 1], [1, 256] 2025-09-07T15:21:02.5474228Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T15:21:02.5475309Z triton_mm_117 0.0164 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:02.5476613Z triton_mm_110 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:02.5477816Z triton_mm_111 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:02.5479007Z triton_mm_112 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:02.5480186Z triton_mm_113 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:02.5481373Z triton_mm_114 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:02.5482561Z triton_mm_116 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:02.5483951Z triton_mm_118 0.0174 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:02.5484804Z mm 0.0184 ms 88.9% 2025-09-07T15:21:02.5485495Z triton_mm_107 0.0184 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T15:21:02.5486521Z SingleProcess AUTOTUNE benchmarking takes 0.3018 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T15:21:03.3609956Z Autotune Choices Stats: 2025-09-07T15:21:03.3611185Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_866", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.020479999482631683, "best_triton_pos": 0} 2025-09-07T15:21:03.3894851Z AUTOTUNE mm(12288x384, 384x128) 2025-09-07T15:21:03.3895162Z strides: [384, 1], [1, 384] 2025-09-07T15:21:03.3895466Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T15:21:03.3896249Z triton_mm_866 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:03.3897519Z triton_mm_871 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:03.3898741Z triton_mm_872 0.0205 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:03.3899491Z mm 0.0215 ms 95.2% 2025-09-07T15:21:03.3900182Z triton_mm_864 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:03.3901365Z 
triton_mm_865 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:03.3902537Z triton_mm_867 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:03.3903954Z triton_mm_868 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:03.3905241Z triton_mm_870 0.0225 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:03.3906553Z triton_mm_862 0.0246 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T15:21:03.3907586Z SingleProcess AUTOTUNE benchmarking takes 0.3034 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T15:21:03.7615089Z Autotune Choices Stats: 2025-09-07T15:21:03.7616636Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.014336000196635723, "best_triton_pos": 1, "best_triton_time": 0.014336000196635723, "best_triton_kernel": "triton_mm_319", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T15:21:03.7896719Z AUTOTUNE mm(3072x512, 512x256) 2025-09-07T15:21:03.7897103Z strides: [512, 1], [1, 512] 2025-09-07T15:21:03.7897467Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T15:21:03.7897775Z mm 0.0143 ms 100.0% 2025-09-07T15:21:03.7898484Z triton_mm_319 0.0143 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:03.7899680Z triton_mm_315 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T15:21:03.7901157Z triton_mm_318 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:03.7902352Z triton_mm_321 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:03.7903554Z triton_mm_320 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:03.7904733Z triton_mm_325 0.0164 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:03.7905920Z triton_mm_316 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T15:21:03.7907099Z triton_mm_317 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:03.7908287Z triton_mm_324 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:03.7909322Z SingleProcess AUTOTUNE benchmarking takes 0.2907 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T15:21:04.6084879Z Autotune Choices Stats: 2025-09-07T15:21:04.6086367Z {"num_choices": 19, 
"num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.01740800030529499, "best_triton_pos": 1, "best_triton_time": 0.01740800030529499, "best_triton_kernel": "triton_mm_755", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T15:21:04.6366068Z AUTOTUNE mm(3072x768, 768x256) 2025-09-07T15:21:04.6366387Z strides: [768, 1], [1, 768] 2025-09-07T15:21:04.6366685Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T15:21:04.6367006Z mm 0.0174 ms 100.0% 2025-09-07T15:21:04.6368120Z triton_mm_755 0.0174 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:04.6369419Z triton_mm_751 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T15:21:04.6370593Z triton_mm_754 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:04.6371780Z triton_mm_757 0.0184 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:04.6372964Z triton_mm_752 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T15:21:04.6374156Z triton_mm_760 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:04.6375353Z triton_mm_761 0.0205 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:04.6376542Z triton_mm_753 0.0215 ms 81.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:04.6377905Z triton_mm_756 0.0215 ms 81.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:04.6378923Z SingleProcess AUTOTUNE benchmarking takes 0.3007 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T15:21:05.0198528Z Autotune Choices Stats: 2025-09-07T15:21:05.0200025Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.014336000196635723, "best_triton_pos": 1, "best_triton_time": 0.015359999611973763, "best_triton_kernel": "triton_mm_523", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T15:21:05.0483756Z AUTOTUNE mm(768x1024, 1024x512) 2025-09-07T15:21:05.0484064Z strides: [1024, 1], [1, 1024] 2025-09-07T15:21:05.0484385Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T15:21:05.0484686Z mm 0.0143 ms 100.0% 2025-09-07T15:21:05.0485387Z triton_mm_523 0.0154 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T15:21:05.0486584Z triton_mm_522 0.0174 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T15:21:05.0487771Z triton_mm_519 0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T15:21:05.0488973Z triton_mm_526 
0.0184 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:05.0490171Z triton_mm_516 0.0195 ms 73.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T15:21:05.0491353Z triton_mm_517 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:05.0492726Z triton_mm_518 0.0205 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:05.0494014Z triton_mm_525 0.0215 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:05.0495285Z triton_mm_528 0.0215 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:05.0496308Z SingleProcess AUTOTUNE benchmarking takes 0.3016 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T15:21:05.7926741Z Autotune Choices Stats: 2025-09-07T15:21:05.7928250Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.01945599913597107, "best_triton_pos": 1, "best_triton_time": 0.01945599913597107, "best_triton_kernel": "triton_mm_666", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T15:21:05.8216577Z AUTOTUNE mm(768x2048, 2048x512) 2025-09-07T15:21:05.8216907Z strides: [2048, 1], [1, 2048] 2025-09-07T15:21:05.8217217Z dtypes: torch.bfloat16, torch.bfloat16 
2025-09-07T15:21:05.8217588Z mm 0.0195 ms 100.0% 2025-09-07T15:21:05.8218299Z triton_mm_666 0.0195 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T15:21:05.8219766Z triton_mm_665 0.0256 ms 76.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T15:21:05.8220938Z triton_mm_662 0.0266 ms 73.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T15:21:05.8222134Z triton_mm_669 0.0287 ms 67.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:05.8223325Z triton_mm_661 0.0317 ms 61.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:05.8224506Z triton_mm_668 0.0317 ms 61.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:05.8225693Z triton_mm_659 0.0328 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T15:21:05.8226875Z triton_mm_660 0.0328 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:05.8228070Z triton_mm_671 0.0328 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:05.8229093Z SingleProcess AUTOTUNE benchmarking takes 
0.3370 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T15:21:06.1451153Z Autotune Choices Stats: 2025-09-07T15:21:06.1452350Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_844", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T15:21:06.1759227Z AUTOTUNE mm(3072x256, 256x128) 2025-09-07T15:21:06.1759539Z strides: [256, 1], [1, 256] 2025-09-07T15:21:06.1759842Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T15:21:06.1760806Z triton_mm_844 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T15:21:06.1762098Z triton_mm_840 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:06.1763587Z triton_mm_845 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T15:21:06.1764778Z triton_mm_847 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:06.1765972Z triton_mm_848 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:06.1767160Z triton_mm_850 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:06.1767892Z mm 0.0123 ms 83.3% 2025-09-07T15:21:06.1768587Z triton_mm_838 0.0123 ms 83.3% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T15:21:06.1769769Z triton_mm_839 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:06.1771051Z triton_mm_846 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:06.1772083Z SingleProcess AUTOTUNE benchmarking takes 0.2795 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T15:21:06.7101464Z Autotune Choices Stats: 2025-09-07T15:21:06.7102729Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_730", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.010239999741315842, "best_triton_pos": 0} 2025-09-07T15:21:06.7387355Z AUTOTUNE mm(768x512, 512x256) 2025-09-07T15:21:06.7387678Z strides: [512, 1], [1, 512] 2025-09-07T15:21:06.7387980Z dtypes: torch.bfloat16, torch.bfloat16 2025-09-07T15:21:06.7388760Z triton_mm_730 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T15:21:06.7389534Z mm 0.0113 ms 90.9% 2025-09-07T15:21:06.7390211Z triton_mm_728 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:06.7391401Z triton_mm_729 0.0113 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T15:21:06.7392591Z triton_mm_734 0.0113 
ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T15:21:06.7393781Z triton_mm_727 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T15:21:06.7394959Z triton_mm_733 0.0123 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T15:21:06.7396147Z triton_mm_736 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:06.7397594Z triton_mm_737 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T15:21:06.7398945Z triton_mm_739 0.0133 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T15:21:06.7399982Z SingleProcess AUTOTUNE benchmarking takes 0.2839 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T15:21:08.5050309Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T15:21:08.5051042Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 482, in forward_pass 2025-09-07T15:21:08.5051619Z return mod(*inputs) 2025-09-07T15:21:08.5052079Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T15:21:08.5052618Z return self.forward_once(x) 2025-09-07T15:21:08.5053137Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T15:21:08.5053707Z yolo_out.append(module(x, out)) 2025-09-07T15:21:08.5054207Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T15:21:08.5054791Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T15:21:08.5055376Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T15:21:08.5055960Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T15:21:08.5056215Z 2025-09-07T15:21:08.5056388Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T15:21:08.5057254Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 482, in forward_pass 2025-09-07T15:21:08.5057922Z return mod(*inputs) 2025-09-07T15:21:08.5058385Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T15:21:08.5058920Z return self.forward_once(x) 2025-09-07T15:21:08.5059460Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T15:21:08.5060001Z yolo_out.append(module(x, out)) 2025-09-07T15:21:08.5060515Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T15:21:08.5061088Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T15:21:08.5061671Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T15:21:08.5062243Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T15:21:08.5062505Z 2025-09-07T15:21:08.5062667Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T15:21:08.5063300Z File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 482, in forward_pass 2025-09-07T15:21:08.5063864Z return mod(*inputs) 2025-09-07T15:21:08.5064319Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 309, in forward 2025-09-07T15:21:08.5064831Z return self.forward_once(x) 2025-09-07T15:21:08.5065347Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 372, in forward_once 2025-09-07T15:21:08.5065899Z yolo_out.append(module(x, out)) 2025-09-07T15:21:08.5066410Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 246, in forward 2025-09-07T15:21:08.5066961Z self.grid = self.create_grids((nx, ny), p.device) 2025-09-07T15:21:08.5067532Z File "/torchbench/torchbenchmark/models/yolov3/yolo_models.py", line 199, in create_grids 2025-09-07T15:21:08.5068111Z self.ng = torch.tensor(ng, dtype=torch.float) 2025-09-07T15:21:08.5068363Z 2025-09-07T15:21:08.6122373Z cudagraph partition into 2 partitions 2025-09-07T15:21:24.5331876Z pass 2025-09-07T15:21:27.4740472Z accuracy pass_rate=88.89% 2025-09-07T15:21:27.4747268Z calls_captured gmean=0.00x mean=400.389x 2025-09-07T15:21:27.4751413Z unique_graphs gmean=0.00x mean=2.389x 2025-09-07T15:21:27.4755685Z graph_breaks gmean=0.00x mean=1.889x 2025-09-07T15:21:27.4759761Z unique_graph_breaks gmean=0.00x mean=0.389x 2025-09-07T15:21:27.4764622Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T15:21:27.4768443Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T15:21:27.4772486Z cudagraph_skips gmean=nanx mean=-0.333x 2025-09-07T15:21:27.4773745Z compilation_latency mean=25.099 seconds 2025-09-07T15:21:28.3429973Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *cudagraphs_low_precision-true* ]] 2025-09-07T15:21:28.3431504Z + [[ inference == 
\i\n\f\e\r\e\n\c\e ]] 2025-09-07T15:21:28.3433018Z + python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --inference --quant --backend inductor --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_cudagraphs_low_precision_torchbench_quant_inference_cuda_accuracy.csv 2025-09-07T15:21:28.8961764Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T15:21:28.8963388Z import pynvml # type: ignore[import] 2025-09-07T15:21:31.4475357Z usage: torchbench.py 2025-09-07T15:21:31.4475690Z [-h] 2025-09-07T15:21:31.4475918Z [--filter FILTER] 2025-09-07T15:21:31.4476178Z [--exclude EXCLUDE] 2025-09-07T15:21:31.4479936Z [--exclude-exact EXCLUDE_EXACT] 2025-09-07T15:21:31.4480329Z [--total-partitions {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}] 2025-09-07T15:21:31.4480741Z [--partition-id PARTITION_ID] 2025-09-07T15:21:31.4481042Z [--devices DEVICES] 2025-09-07T15:21:31.4481322Z [--device-index DEVICE_INDEX] 2025-09-07T15:21:31.4481631Z [--repeat REPEAT] 2025-09-07T15:21:31.4481937Z [--iterations-per-run ITERATIONS_PER_RUN] 2025-09-07T15:21:31.4482283Z [--randomize-input] 2025-09-07T15:21:31.4482554Z [--threads THREADS] 2025-09-07T15:21:31.4483069Z [--nopython] 2025-09-07T15:21:31.4483324Z [--no-skip] 2025-09-07T15:21:31.4483552Z [--prims-nvfuser] 2025-09-07T15:21:31.4483828Z [--dump-raw-metrics] 2025-09-07T15:21:31.4484115Z [--log-operator-inputs] 2025-09-07T15:21:31.4484407Z [--channels-last] 2025-09-07T15:21:31.4484666Z [--batch-size BATCH_SIZE] 2025-09-07T15:21:31.4484962Z [--iterations ITERATIONS] 2025-09-07T15:21:31.4485283Z [--batch-size-file BATCH_SIZE_FILE] 2025-09-07T15:21:31.4485592Z [--cosine] 2025-09-07T15:21:31.4485807Z [--freezing] 
2025-09-07T15:21:31.4486067Z [--inductor-config INDUCTOR_CONFIG] 2025-09-07T15:21:31.4486378Z [--ci] 2025-09-07T15:21:31.4486600Z [--dashboard] 2025-09-07T15:21:31.4486840Z [--skip-fp64-check] 2025-09-07T15:21:31.4487099Z [--fast] 2025-09-07T15:21:31.4487327Z [--only ONLY] 2025-09-07T15:21:31.4487579Z [--multiprocess] 2025-09-07T15:21:31.4487815Z [--ddp] 2025-09-07T15:21:31.4488037Z [--fsdp] 2025-09-07T15:21:31.4488302Z [--optimize-ddp-mode OPTIMIZE_DDP_MODE] 2025-09-07T15:21:31.4488709Z [--distributed-master-port DISTRIBUTED_MASTER_PORT] 2025-09-07T15:21:31.4489075Z [--dynamic-shapes] 2025-09-07T15:21:31.4489358Z [--propagate-real-tensors] 2025-09-07T15:21:31.4489667Z [--dynamic-batch-only] 2025-09-07T15:21:31.4489952Z [--specialize-int] 2025-09-07T15:21:31.4490215Z [--use-eval-mode] 2025-09-07T15:21:31.4490487Z [--skip-accuracy-check] 2025-09-07T15:21:31.4490791Z [--generate-aot-autograd-stats] 2025-09-07T15:21:31.4491114Z [--inductor-settings] 2025-09-07T15:21:31.4491388Z [--suppress-errors] 2025-09-07T15:21:31.4491655Z [--output OUTPUT] 2025-09-07T15:21:31.4491942Z [--output-directory OUTPUT_DIRECTORY] 2025-09-07T15:21:31.4492271Z [--disable-output] 2025-09-07T15:21:31.4492712Z [--baseline BASELINE] 2025-09-07T15:21:31.4492992Z [--part PART] 2025-09-07T15:21:31.4493338Z [--export-profiler-trace] 2025-09-07T15:21:31.4493762Z [--profiler-trace-name PROFILER_TRACE_NAME] 2025-09-07T15:21:31.4494102Z [--profile-details] 2025-09-07T15:21:31.4494380Z [--export-perfdoctor] 2025-09-07T15:21:31.4494669Z [--diff-branch DIFF_BRANCH] 2025-09-07T15:21:31.4494960Z [--tag TAG] 2025-09-07T15:21:31.4495176Z [--explain] 2025-09-07T15:21:31.4495405Z [--stats] 2025-09-07T15:21:31.4495647Z [--use-warm-peak-memory] 2025-09-07T15:21:31.4495931Z [--print-memory] 2025-09-07T15:21:31.4496188Z [--print-compilation-time] 2025-09-07T15:21:31.4496502Z [--print-dataframe-summary] 2025-09-07T15:21:31.4496810Z [--disable-cudagraphs] 2025-09-07T15:21:31.4497114Z [--disable-split-reductions] 
2025-09-07T15:21:31.4497530Z [--disable-persistent-reductions] 2025-09-07T15:21:31.4497882Z [--disable-divisible-by-16] 2025-09-07T15:21:31.4498236Z [--inductor-compile-mode INDUCTOR_COMPILE_MODE] 2025-09-07T15:21:31.4498606Z [--print-graph-breaks] 2025-09-07T15:21:31.4498908Z [--log-graph-breaks] 2025-09-07T15:21:31.4499180Z [--trace-on-xla] 2025-09-07T15:21:31.4499452Z [--xla-tolerance XLA_TOLERANCE] 2025-09-07T15:21:31.4499772Z [--collect-outputs] 2025-09-07T15:21:31.4500061Z [--enable-activation-checkpointing] 2025-09-07T15:21:31.4500381Z [--timing] 2025-09-07T15:21:31.4500697Z [--progress] 2025-09-07T15:21:31.4500942Z [--timeout TIMEOUT] 2025-09-07T15:21:31.4501286Z [--per_process_memory_fraction PER_PROCESS_MEMORY_FRACTION] 2025-09-07T15:21:31.4501706Z [--no-translation-validation] 2025-09-07T15:21:31.4502012Z [--minify] 2025-09-07T15:21:31.4502260Z [--compiled-autograd] 2025-09-07T15:21:31.4502543Z [--profile_dynamo_cache_lookup] 2025-09-07T15:21:31.4502860Z [--snapshot-memory] 2025-09-07T15:21:31.4503128Z [--retain-output] 2025-09-07T15:21:31.4503397Z [--caching-precompile] 2025-09-07T15:21:31.4503726Z [--cold-start-latency | --warm-start-latency] 2025-09-07T15:21:31.4504069Z [--nnc] 2025-09-07T15:21:31.4504330Z [--float16 | --bfloat16 | --float32 | --amp] 2025-09-07T15:21:31.4504687Z [--amp-dtype {bfloat16,float16}] 2025-09-07T15:21:31.4504991Z [--verbose | --quiet] 2025-09-07T15:21:31.4508516Z [--coverage | --overhead | --speedup-dynamo-ts | --speedup-fx2trt | --speedup-fx2trt-fp16 | --print-fx | --print-aten-ops | --inductor | --quantization {int8dynamic,int8weightonly,int4weightonly,autoquant,noquant} | --export | --export-aot-inductor | --export-nativert | --torchscript-jit-trace | --xla | --backend 
{aot_eager,aot_eager_decomp_partition,aot_eager_decomp_partition_crossref,aot_eager_decomp_partition_with_mode,aot_eager_default_partitioner,aot_ts,cudagraphs,dynamo_accuracy_minifier_backend,dynamo_minifier_backend,eager,eager_debug,eager_noexcept,inductor,non_leaf_compile_error_TESTING_ONLY,openxla,openxla_eval,pre_dispatch_eager,relu_accuracy_error_TESTING_ONLY,relu_compile_error_TESTING_ONLY,relu_runtime_error_TESTING_ONLY,ts,tvm} | --nothing | --log-conv-args | --recompile-profiler | --find-batch-sizes] 2025-09-07T15:21:31.4512105Z (--accuracy | --performance | --tolerance) 2025-09-07T15:21:31.4512446Z (--training | --inference) 2025-09-07T15:21:31.4512869Z torchbench.py: error: argument --quantization: expected one argument 2025-09-07T15:21:32.1096524Z + true 2025-09-07T15:21:32.1097888Z + cp /var/lib/jenkins/workspace/test/test-reports/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_accuracy.csv /var/lib/jenkins/workspace/test/test-reports/inductor_cudagraphs_low_precision_torchbench_quant_inference_cuda_accuracy.csv 2025-09-07T15:21:32.1114437Z + for target in "${targets[@]}" 2025-09-07T15:21:32.1114771Z + target_flag=('--performance') 2025-09-07T15:21:32.1115049Z + local target_flag 2025-09-07T15:21:32.1115545Z + [[ performance == \p\e\r\f\o\r\m\a\n\c\e ]] 2025-09-07T15:21:32.1115997Z + target_flag+=(--cold-start-latency) 2025-09-07T15:21:32.1117485Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *freezing-true* ]] 2025-09-07T15:21:32.1119917Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *default-true* ]] 2025-09-07T15:21:32.1122529Z + python benchmarks/dynamo/torchbench.py --performance --cold-start-latency 
--inference --bfloat16 --backend inductor --disable-cudagraphs --device cuda --total-partitions 6 --partition-id 5 --output /var/lib/jenkins/workspace/test/test-reports/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_performance.csv 2025-09-07T15:21:32.6700786Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T15:21:32.6702270Z import pynvml # type: ignore[import] 2025-09-07T15:21:36.2325183Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T15:21:36.2326890Z import pynvml # type: ignore[import] 2025-09-07T15:21:38.8290280Z 2025-09-07T15:21:39.6812128Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:21:39.6812520Z loading model: 0it [00:00, ?it/s] 2025-09-07T15:21:39.6814162Z cuda eval soft_actor_critic 2025-09-07T15:21:44.7793890Z 2025-09-07T15:21:44.8130061Z running benchmark: 0% 0/30 [00:00 2025-09-07T15:30:26.3879908Z W0907 15:30:26.385000 704654 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T15:30:26.3881335Z W0907 15:30:26.385000 704654 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T15:30:26.3882718Z W0907 15:30:26.385000 704654 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T15:30:26.3883891Z W0907 15:30:26.385000 704654 
site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T15:30:26.3884564Z W0907 15:30:26.385000 704654 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T15:30:42.2179975Z 2025-09-07T15:30:42.3293277Z running benchmark: 0% 0/30 [00:00 2025-09-07T15:40:20.9552637Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:40:20.9553044Z 2025-09-07T15:40:20.9553242Z cudagraph partition due to DeviceCopy ops. Found from : 2025-09-07T15:40:20.9553968Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:40:20.9554683Z anchors = self.anchor_generator(images, features) 2025-09-07T15:40:20.9555446Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:40:20.9556203Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:40:20.9557351Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:40:20.9558376Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:40:20.9559277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:40:20.9560175Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:40:20.9560600Z 2025-09-07T15:40:20.9560783Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T15:40:20.9561515Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:40:20.9562225Z anchors = self.anchor_generator(images, features) 2025-09-07T15:40:20.9563188Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:40:20.9563947Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:40:20.9564747Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:40:20.9565683Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:40:20.9566584Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:40:20.9567564Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:40:20.9567983Z 2025-09-07T15:40:20.9568164Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T15:40:20.9568893Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:40:20.9569605Z anchors = self.anchor_generator(images, features) 2025-09-07T15:40:20.9570354Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:40:20.9571098Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:40:20.9571886Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:40:20.9572809Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:40:20.9573710Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:40:20.9574608Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:40:20.9575008Z 2025-09-07T15:40:20.9575187Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T15:40:20.9575922Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:40:20.9576630Z anchors = self.anchor_generator(images, features) 2025-09-07T15:40:20.9577452Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:40:20.9578212Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:40:20.9578993Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:40:20.9579923Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:40:20.9580825Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:40:20.9581726Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:40:20.9582130Z 2025-09-07T15:40:20.9759588Z cudagraph partition into 2 partitions 2025-09-07T15:40:23.9472555Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] Graph break from `Tensor.item()`, consider setting: 2025-09-07T15:40:23.9473821Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] torch._dynamo.config.capture_scalar_outputs = True 2025-09-07T15:40:23.9474725Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] or: 2025-09-07T15:40:23.9475577Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] env TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 2025-09-07T15:40:23.9476629Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] to include these operations in the captured graph. 
2025-09-07T15:40:23.9477501Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T15:40:23.9478318Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] Graph break: from user code at: 2025-09-07T15:40:23.9479856Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in torch_dynamo_resume_in_forward_at_117 2025-09-07T15:40:23.9481639Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T15:40:23.9483552Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 763, in forward 2025-09-07T15:40:23.9485037Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T15:40:23.9486439Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 310, in forward 2025-09-07T15:40:23.9487759Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] self.scales, self.map_levels = _setup_scales( 2025-09-07T15:40:23.9489103Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in _setup_scales 2025-09-07T15:40:23.9490536Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat 
in features] 2025-09-07T15:40:23.9491959Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in 2025-09-07T15:40:23.9493365Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T15:40:23.9494785Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T15:40:23.9496152Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T15:40:23.9497061Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T15:40:23.9497809Z W0907 15:40:23.946000 725176 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T15:40:24.2149091Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T15:40:24.2150167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T15:40:24.2151223Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T15:40:24.2151592Z 2025-09-07T15:40:28.2714227Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T15:40:28.2715130Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T15:40:28.2716063Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T15:40:28.2716420Z 2025-09-07T15:40:35.4454114Z 2025-09-07T15:40:35.5773796Z running benchmark: 0% 0/30 [00:00 2025-09-07T15:50:25.1991608Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:50:25.1992021Z 2025-09-07T15:50:25.1992234Z cudagraph partition due to DeviceCopy ops. Found from : 2025-09-07T15:50:25.1992968Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:50:25.1993676Z anchors = self.anchor_generator(images, features) 2025-09-07T15:50:25.1994416Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:50:25.1995172Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:50:25.1995968Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:50:25.1996904Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:50:25.1997807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:50:25.1998789Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:50:25.1999208Z 2025-09-07T15:50:25.1999390Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T15:50:25.2000123Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:50:25.2000841Z anchors = self.anchor_generator(images, features) 2025-09-07T15:50:25.2001595Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:50:25.2002339Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:50:25.2003382Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:50:25.2004320Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:50:25.2005234Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:50:25.2006136Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:50:25.2006537Z 2025-09-07T15:50:25.2006715Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T15:50:25.2007441Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:50:25.2008153Z anchors = self.anchor_generator(images, features) 2025-09-07T15:50:25.2008907Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:50:25.2009657Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:50:25.2010433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:50:25.2011361Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:50:25.2012264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:50:25.2013162Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:50:25.2013564Z 2025-09-07T15:50:25.2013830Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T15:50:25.2014609Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T15:50:25.2015369Z anchors = self.anchor_generator(images, features) 2025-09-07T15:50:25.2016119Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T15:50:25.2016872Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T15:50:25.2017743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T15:50:25.2018662Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:50:25.2019566Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T15:50:25.2020472Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T15:50:25.2020874Z 2025-09-07T15:50:25.2201478Z cudagraph partition into 2 partitions 2025-09-07T15:50:28.2706373Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] Graph break from `Tensor.item()`, consider setting: 2025-09-07T15:50:28.2707498Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] torch._dynamo.config.capture_scalar_outputs = True 2025-09-07T15:50:28.2708407Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] or: 2025-09-07T15:50:28.2709620Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] env TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 2025-09-07T15:50:28.2710670Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] to include these operations in the captured graph. 
2025-09-07T15:50:28.2711555Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T15:50:28.2712362Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] Graph break: from user code at: 2025-09-07T15:50:28.2713896Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in torch_dynamo_resume_in_forward_at_117 2025-09-07T15:50:28.2715671Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T15:50:28.2717269Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 763, in forward 2025-09-07T15:50:28.2718758Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T15:50:28.2720155Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 310, in forward 2025-09-07T15:50:28.2721468Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] self.scales, self.map_levels = _setup_scales( 2025-09-07T15:50:28.2723053Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in _setup_scales 2025-09-07T15:50:28.2724499Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat 
in features] 2025-09-07T15:50:28.2726131Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in 2025-09-07T15:50:28.2727638Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T15:50:28.2729064Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T15:50:28.2730447Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T15:50:28.2731356Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T15:50:28.2732024Z W0907 15:50:28.269000 745687 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T15:50:28.5448270Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T15:50:28.5449171Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T15:50:28.5450088Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T15:50:28.5450446Z 2025-09-07T15:50:32.6013279Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T15:50:32.6014186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T15:50:32.6015502Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T15:50:32.6015861Z 2025-09-07T15:50:39.6681672Z 2025-09-07T15:50:39.7745968Z running benchmark: 0% 0/30 [00:00 2025-09-07T16:04:24.7486327Z W0907 16:04:24.745000 767779 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T16:04:24.7487868Z W0907 16:04:24.745000 767779 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T16:04:24.7489350Z W0907 16:04:24.745000 767779 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T16:04:24.7490348Z W0907 16:04:24.745000 767779 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T16:04:24.7491113Z W0907 16:04:24.745000 767779 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T16:04:51.6368040Z 2025-09-07T16:04:51.7470253Z running benchmark: 0% 0/30 [00:00 2025-09-07T16:15:16.0652170Z W0907 16:15:16.062000 787684 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T16:15:16.0653597Z W0907 16:15:16.062000 787684 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T16:15:16.0654978Z W0907 16:15:16.062000 787684 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T16:15:16.0655892Z W0907 
16:15:16.062000 787684 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T16:15:16.0656569Z W0907 16:15:16.062000 787684 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T16:15:16.3253626Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T16:15:16.3254516Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T16:15:16.3255446Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T16:15:16.3255799Z 2025-09-07T16:15:20.6854393Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T16:15:20.6855644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T16:15:20.6856656Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T16:15:20.6857125Z 2025-09-07T16:15:27.4117495Z 2025-09-07T16:15:27.5203078Z running benchmark: 0% 0/30 [00:00 2025-09-07T16:32:05.9576062Z W0907 16:32:05.954000 857233 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T16:32:05.9577553Z W0907 16:32:05.954000 857233 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T16:32:05.9579076Z W0907 16:32:05.954000 857233 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T16:32:05.9580078Z W0907 16:32:05.954000 857233 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T16:32:05.9580766Z W0907 16:32:05.954000 857233 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T16:32:06.2269727Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T16:32:06.2270610Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T16:32:06.2271541Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T16:32:06.2271893Z 2025-09-07T16:32:10.2909713Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T16:32:10.2910619Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T16:32:10.2911555Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T16:32:10.2911915Z 2025-09-07T16:32:17.2448531Z 2025-09-07T16:32:17.3532170Z running benchmark: 0% 0/30 [00:00 found. Please update your _register_pytree_node call with a `serialized_type_name` kwarg. 2025-09-07T16:35:42.1567860Z warmup_failed 2025-09-07T16:35:43.6946919Z Run failed with return code: 255 2025-09-07T16:35:43.6947309Z Output: None 2025-09-07T16:35:43.6947544Z Error: None 2025-09-07T16:35:44.2505078Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T16:35:44.2506484Z import pynvml # type: ignore[import] 2025-09-07T16:35:46.8620108Z 2025-09-07T16:35:49.2893507Z loading model: 0it [00:00, ?it/s] 2025-09-07T16:35:49.2893909Z loading model: 0it [00:02, ?it/s] 2025-09-07T16:35:49.2955704Z cuda eval timm_efficientnet 2025-09-07T16:36:16.7918652Z 2025-09-07T16:36:19.2584395Z running benchmark: 0% 0/30 [00:00", line 1, in 2025-09-07T16:41:26.7550086Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 1593, in wrapped_fn 2025-09-07T16:41:26.7550699Z return tuple(flat_fn(*args)) 2025-09-07T16:41:26.7551321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/utils.py", line 187, in flat_fn 2025-09-07T16:41:26.7551986Z tree_out = fn(*args, **kwargs) 2025-09-07T16:41:26.7552749Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/graph_capture_wrappers.py", line 1354, in functional_call 2025-09-07T16:41:26.7553551Z out = mod(*args[params_len:], **kwargs) 2025-09-07T16:41:26.7554223Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 843, in module_call_wrapper 2025-09-07T16:41:26.7554926Z return self.call_module(mod, forward, args, kwargs) 2025-09-07T16:41:26.7555664Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1997, in call_module 2025-09-07T16:41:26.7556494Z return Tracer.call_module(self, m, forward, args, kwargs) 2025-09-07T16:41:26.7557191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 560, in call_module 2025-09-07T16:41:26.7557830Z ret_val = forward(*args, **kwargs) 2025-09-07T16:41:26.7558419Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 836, in forward 2025-09-07T16:41:26.7559067Z return _orig_module_call(mod, *args, **kwargs) 2025-09-07T16:41:26.7559772Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T16:41:26.7560519Z return self._call_impl(*args, **kwargs) 2025-09-07T16:41:26.7561145Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T16:41:26.7561785Z return forward_call(*args, **kwargs) 2025-09-07T16:41:26.7562379Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_trace.py", line 1906, in forward 2025-09-07T16:41:26.7563119Z tree_out = mod(*args, **kwargs) 2025-09-07T16:41:26.7563773Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 843, in module_call_wrapper 2025-09-07T16:41:26.7564480Z return self.call_module(mod, forward, args, kwargs) 2025-09-07T16:41:26.7565223Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1997, in call_module 2025-09-07T16:41:26.7565995Z return Tracer.call_module(self, m, forward, args, kwargs) 2025-09-07T16:41:26.7566693Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 560, in call_module 2025-09-07T16:41:26.7567326Z ret_val = forward(*args, **kwargs) 2025-09-07T16:41:26.7567912Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 836, in forward 2025-09-07T16:41:26.7568554Z return _orig_module_call(mod, *args, **kwargs) 2025-09-07T16:41:26.7569254Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T16:41:26.7569938Z return self._call_impl(*args, **kwargs) 2025-09-07T16:41:26.7570564Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T16:41:26.7571270Z return forward_call(*args, **kwargs) 2025-09-07T16:41:26.7571853Z File 
"/torchbench/torchbenchmark/models/tts_angular/model.py", line 73, in forward 2025-09-07T16:41:26.7572365Z d = self.layers(x) 2025-09-07T16:41:26.7572980Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 843, in module_call_wrapper 2025-09-07T16:41:26.7573690Z return self.call_module(mod, forward, args, kwargs) 2025-09-07T16:41:26.7574431Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1997, in call_module 2025-09-07T16:41:26.7575209Z return Tracer.call_module(self, m, forward, args, kwargs) 2025-09-07T16:41:26.7575902Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 560, in call_module 2025-09-07T16:41:26.7576539Z ret_val = forward(*args, **kwargs) 2025-09-07T16:41:26.7577129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 836, in forward 2025-09-07T16:41:26.7577870Z return _orig_module_call(mod, *args, **kwargs) 2025-09-07T16:41:26.7578576Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T16:41:26.7579272Z return self._call_impl(*args, **kwargs) 2025-09-07T16:41:26.7579897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T16:41:26.7580539Z return forward_call(*args, **kwargs) 2025-09-07T16:41:26.7581245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py", line 250, in forward 2025-09-07T16:41:26.7581879Z input = module(input) 2025-09-07T16:41:26.7582498Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 843, in module_call_wrapper 2025-09-07T16:41:26.7583204Z return self.call_module(mod, forward, args, kwargs) 2025-09-07T16:41:26.7583949Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", 
line 1997, in call_module 2025-09-07T16:41:26.7584732Z return Tracer.call_module(self, m, forward, args, kwargs) 2025-09-07T16:41:26.7585522Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 560, in call_module 2025-09-07T16:41:26.7586158Z ret_val = forward(*args, **kwargs) 2025-09-07T16:41:26.7586744Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 836, in forward 2025-09-07T16:41:26.7587392Z return _orig_module_call(mod, *args, **kwargs) 2025-09-07T16:41:26.7588089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T16:41:26.7588773Z return self._call_impl(*args, **kwargs) 2025-09-07T16:41:26.7589398Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T16:41:26.7590042Z return forward_call(*args, **kwargs) 2025-09-07T16:41:26.7590550Z File "/torchbench/torchbenchmark/models/tts_angular/model.py", line 18, in forward 2025-09-07T16:41:26.7591060Z o, (_, _) = self.lstm(x) 2025-09-07T16:41:26.7591683Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 843, in module_call_wrapper 2025-09-07T16:41:26.7592383Z return self.call_module(mod, forward, args, kwargs) 2025-09-07T16:41:26.7593123Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1997, in call_module 2025-09-07T16:41:26.7593889Z return Tracer.call_module(self, m, forward, args, kwargs) 2025-09-07T16:41:26.7594593Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 560, in call_module 2025-09-07T16:41:26.7595216Z ret_val = forward(*args, **kwargs) 2025-09-07T16:41:26.7595870Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_symbolic_trace.py", line 836, in forward 2025-09-07T16:41:26.7596521Z return 
_orig_module_call(mod, *args, **kwargs) 2025-09-07T16:41:26.7597268Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl 2025-09-07T16:41:26.7597960Z return self._call_impl(*args, **kwargs) 2025-09-07T16:41:26.7598582Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl 2025-09-07T16:41:26.7599219Z return forward_call(*args, **kwargs) 2025-09-07T16:41:26.7599816Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 1046, in forward 2025-09-07T16:41:26.7600417Z self._update_flat_weights() 2025-09-07T16:41:26.7601052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 393, in _update_flat_weights 2025-09-07T16:41:26.7601700Z self._init_flat_weights() 2025-09-07T16:41:26.7602322Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 215, in _init_flat_weights 2025-09-07T16:41:26.7603111Z self.flatten_parameters() 2025-09-07T16:41:26.7603736Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 255, in flatten_parameters 2025-09-07T16:41:26.7604373Z unique_data_ptrs = { 2025-09-07T16:41:26.7604931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 256, in 2025-09-07T16:41:26.7605567Z p.data_ptr() # type: ignore[union-attr] 2025-09-07T16:41:26.7606374Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1409, in __torch_function__ 2025-09-07T16:41:26.7607102Z return func(*args, **kwargs) 2025-09-07T16:41:26.7607786Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/proxy_tensor.py", line 1479, in __torch_function__ 2025-09-07T16:41:26.7608517Z return func(*args, **kwargs) 2025-09-07T16:41:26.7609182Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/non_strict_utils.py", line 1066, in __torch_function__ 2025-09-07T16:41:26.7609867Z return func(*args, **kwargs) 2025-09-07T16:41:26.7611538Z RuntimeError: Cannot access data pointer of Tensor (e.g. FakeTensor, FunctionalTensor). If you're using torch.compile/export/fx, it is likely that we are erroneously tracing into a custom kernel. To fix this, please wrap the custom kernel into an opaque custom op. Please see the following for details: https://pytorch.org/tutorials/advanced/custom_ops_landing_page.html 2025-09-07T16:41:26.7613144Z warmup_failed 2025-09-07T16:41:27.7145974Z Run failed with return code: 255 2025-09-07T16:41:27.7146362Z Output: None 2025-09-07T16:41:27.7146592Z Error: None 2025-09-07T16:41:28.2783869Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T16:41:28.2785375Z import pynvml # type: ignore[import] 2025-09-07T16:41:30.8891438Z 2025-09-07T16:41:33.3199923Z loading model: 0it [00:00, ?it/s] 2025-09-07T16:41:33.3200295Z loading model: 0it [00:02, ?it/s] 2025-09-07T16:41:33.3209043Z cuda eval vgg16 2025-09-07T16:41:51.6341274Z 2025-09-07T16:41:51.7361266Z running benchmark: 0% 0/30 [00:00 2025-09-07T16:57:36.2935343Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T16:57:36.2935749Z 2025-09-07T16:57:36.2936066Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T16:57:36.2936811Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T16:57:36.2937577Z anchors = self.anchor_generator(images, features) 2025-09-07T16:57:36.2938340Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T16:57:36.2939103Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T16:57:36.2939902Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T16:57:36.2940835Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T16:57:36.2941730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T16:57:36.2942637Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T16:57:36.2943053Z 2025-09-07T16:57:36.2943233Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T16:57:36.2943962Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T16:57:36.2944676Z anchors = self.anchor_generator(images, features) 2025-09-07T16:57:36.2945407Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T16:57:36.2946237Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T16:57:36.2947029Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T16:57:36.2947965Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T16:57:36.2948926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T16:57:36.2949820Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T16:57:36.2950288Z 2025-09-07T16:57:36.2950470Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T16:57:36.2951210Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T16:57:36.2951925Z anchors = self.anchor_generator(images, features) 2025-09-07T16:57:36.2952678Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T16:57:36.2953422Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T16:57:36.2954214Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T16:57:36.2955145Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T16:57:36.2956052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T16:57:36.2956955Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T16:57:36.2957356Z 2025-09-07T16:57:36.2957534Z cudagraph partition due to DeviceCopy ops. 
Found from : 2025-09-07T16:57:36.2958264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/rpn.py", line 361, in forward 2025-09-07T16:57:36.2958977Z anchors = self.anchor_generator(images, features) 2025-09-07T16:57:36.2959727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 130, in forward 2025-09-07T16:57:36.2960481Z cell_anchors = self.set_cell_anchors(dtype, device) 2025-09-07T16:57:36.2961312Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in set_cell_anchors 2025-09-07T16:57:36.2962250Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T16:57:36.2963284Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/anchor_utils.py", line 77, in 2025-09-07T16:57:36.2964185Z return [cell_anchor.to(dtype=dtype, device=device) for cell_anchor in self.cell_anchors] 2025-09-07T16:57:36.2964587Z 2025-09-07T16:57:36.3127275Z cudagraph partition into 2 partitions 2025-09-07T16:57:40.4249953Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] Graph break from `Tensor.item()`, consider setting: 2025-09-07T16:57:40.4251088Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] torch._dynamo.config.capture_scalar_outputs = True 2025-09-07T16:57:40.4251997Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] or: 2025-09-07T16:57:40.4252872Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] env TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 2025-09-07T16:57:40.4253912Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] to include these operations in the captured graph. 
2025-09-07T16:57:40.4254793Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T16:57:40.4255604Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] Graph break: from user code at: 2025-09-07T16:57:40.4257504Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/generalized_rcnn.py", line 118, in torch_dynamo_resume_in_forward_at_117 2025-09-07T16:57:40.4259414Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) 2025-09-07T16:57:40.4261015Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/models/detection/roi_heads.py", line 763, in forward 2025-09-07T16:57:40.4262573Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] box_features = self.box_roi_pool(features, proposals, image_shapes) 2025-09-07T16:57:40.4263982Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 310, in forward 2025-09-07T16:57:40.4265298Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] self.scales, self.map_levels = _setup_scales( 2025-09-07T16:57:40.4266643Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in _setup_scales 2025-09-07T16:57:40.4268071Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat 
in features] 2025-09-07T16:57:40.4269486Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 122, in 2025-09-07T16:57:40.4270906Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scales = [_infer_scale(feat, original_input_shape) for feat in features] 2025-09-07T16:57:40.4272326Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 104, in _infer_scale 2025-09-07T16:57:40.4273808Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] scale = 2 ** float(torch.tensor(approx_scale).log2().round()) 2025-09-07T16:57:40.4274728Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T16:57:40.4275418Z W0907 16:57:40.424000 950721 site-packages/torch/_dynamo/variables/tensor.py:1048] [12/0] 2025-09-07T16:57:40.6972505Z cudagraph partition due to non gpu ops. Found from : 2025-09-07T16:57:40.6973350Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 125, in torch_dynamo_resume_in__setup_scales_at_122 2025-09-07T16:57:40.6974271Z lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item() 2025-09-07T16:57:40.6974637Z 2025-09-07T16:57:44.7618907Z cudagraph partition due to non gpu ops. 
Found from : 2025-09-07T16:57:44.7619823Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torchvision/ops/poolers.py", line 126, in torch_dynamo_resume_in__setup_scales_at_125 2025-09-07T16:57:44.7620739Z lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item() 2025-09-07T16:57:44.7621095Z 2025-09-07T16:57:51.6849450Z 2025-09-07T16:57:51.7933559Z running benchmark: 0% 0/30 [00:00> $GITHUB_ENV 2025-09-07T16:59:03.6023683Z echo "DEVICE_TYPE=$DEVICE_TYPE" >> $GITHUB_ENV 2025-09-07T16:59:03.6036331Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T16:59:03.6036724Z env: 2025-09-07T16:59:03.6036958Z GIT_DEFAULT_BRANCH: main 2025-09-07T16:59:03.6037298Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T16:59:03.6037771Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T16:59:03.6038355Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T16:59:03.6038858Z ##[endgroup] 2025-09-07T16:59:03.6068521Z + [[ -n '' ]] 2025-09-07T16:59:03.6068883Z + python3 -mpip install boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-09-07T16:59:03.9141549Z Defaulting to user installation because normal site-packages is not writeable 2025-09-07T16:59:04.7339506Z Collecting boto3==1.35.33 2025-09-07T16:59:04.8037604Z Downloading boto3-1.35.33-py3-none-any.whl (139 kB) 2025-09-07T16:59:04.8386109Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.1/139.1 KB 4.0 MB/s eta 0:00:00 2025-09-07T16:59:05.0045723Z Collecting psutil==7.0.0 2025-09-07T16:59:05.0173525Z Downloading psutil-7.0.0-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (277 kB) 2025-09-07T16:59:05.0559228Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 278.0/278.0 KB 7.3 MB/s eta 0:00:00 2025-09-07T16:59:05.0938280Z Collecting pynvml==12.0.0 2025-09-07T16:59:05.1023320Z Downloading pynvml-12.0.0-py3-none-any.whl (26 kB) 2025-09-07T16:59:05.1492924Z Collecting 
s3transfer<0.11.0,>=0.10.0 2025-09-07T16:59:05.1578418Z Downloading s3transfer-0.10.4-py3-none-any.whl (83 kB) 2025-09-07T16:59:05.1706686Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 83.2/83.2 KB 6.5 MB/s eta 0:00:00 2025-09-07T16:59:05.1971184Z Collecting jmespath<2.0.0,>=0.7.1 2025-09-07T16:59:05.2067415Z Downloading jmespath-1.0.1-py3-none-any.whl (20 kB) 2025-09-07T16:59:05.9690046Z Collecting botocore<1.36.0,>=1.35.33 2025-09-07T16:59:05.9814660Z Downloading botocore-1.35.99-py3-none-any.whl (13.3 MB) 2025-09-07T16:59:06.3389207Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.3/13.3 MB 48.7 MB/s eta 0:00:00 2025-09-07T16:59:06.4309290Z Collecting nvidia-ml-py<13.0.0a0,>=12.0.0 2025-09-07T16:59:06.4397871Z Downloading nvidia_ml_py-12.575.51-py3-none-any.whl (47 kB) 2025-09-07T16:59:06.4466357Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 47.5/47.5 KB 7.3 MB/s eta 0:00:00 2025-09-07T16:59:06.4538766Z Requirement already satisfied: urllib3!=2.2.0,<3,>=1.25.4 in /usr/lib/python3/dist-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.26.5) 2025-09-07T16:59:06.4546316Z Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /usr/lib/python3/dist-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (2.8.1) 2025-09-07T16:59:06.7765735Z Installing collected packages: nvidia-ml-py, pynvml, psutil, jmespath, botocore, s3transfer, boto3 2025-09-07T16:59:06.7766563Z Attempting uninstall: nvidia-ml-py 2025-09-07T16:59:06.7772120Z Found existing installation: nvidia-ml-py 11.525.84 2025-09-07T16:59:06.7809450Z Uninstalling nvidia-ml-py-11.525.84: 2025-09-07T16:59:06.7834036Z Successfully uninstalled nvidia-ml-py-11.525.84 2025-09-07T16:59:06.8543620Z Attempting uninstall: psutil 2025-09-07T16:59:06.8550891Z Found existing installation: psutil 5.9.8 2025-09-07T16:59:06.8700520Z Uninstalling psutil-5.9.8: 2025-09-07T16:59:06.8709395Z Successfully uninstalled psutil-5.9.8 2025-09-07T16:59:07.6727829Z Successfully installed boto3-1.35.33 botocore-1.35.99 
jmespath-1.0.1 nvidia-ml-py-12.575.51 psutil-7.0.0 pynvml-12.0.0 s3transfer-0.10.4 2025-09-07T16:59:07.7801225Z + DEVICE_NAME= 2025-09-07T16:59:07.7801467Z + DEVICE_TYPE= 2025-09-07T16:59:07.7801729Z + command -v nvidia-smi 2025-09-07T16:59:07.7802241Z + python3 -mpip install torch==2.7.1 2025-09-07T16:59:07.7802791Z /usr/bin/nvidia-smi 2025-09-07T16:59:08.0838240Z Defaulting to user installation because normal site-packages is not writeable 2025-09-07T16:59:08.3334838Z Collecting torch==2.7.1 2025-09-07T16:59:08.3871507Z Downloading torch-2.7.1-cp310-cp310-manylinux_2_28_x86_64.whl (821.2 MB) 2025-09-07T16:59:19.0945172Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 821.2/821.2 MB 1.4 MB/s eta 0:00:00 2025-09-07T16:59:21.2809485Z Collecting nvidia-cuda-nvrtc-cu12==12.6.77 2025-09-07T16:59:21.2845507Z Downloading nvidia_cuda_nvrtc_cu12-12.6.77-py3-none-manylinux2014_x86_64.whl (23.7 MB) 2025-09-07T16:59:21.4677576Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 87.9 MB/s eta 0:00:00 2025-09-07T16:59:21.5551515Z Collecting nvidia-cudnn-cu12==9.5.1.17 2025-09-07T16:59:21.5600672Z Downloading nvidia_cudnn_cu12-9.5.1.17-py3-none-manylinux_2_28_x86_64.whl (571.0 MB) 2025-09-07T16:59:29.3614947Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 571.0/571.0 MB 2.1 MB/s eta 0:00:00 2025-09-07T16:59:30.8054938Z Collecting nvidia-cusparselt-cu12==0.6.3 2025-09-07T16:59:30.8124546Z Downloading nvidia_cusparselt_cu12-0.6.3-py3-none-manylinux2014_x86_64.whl (156.8 MB) 2025-09-07T16:59:32.1955935Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 156.8/156.8 MB 15.7 MB/s eta 0:00:00 2025-09-07T16:59:32.6277282Z Collecting filelock 2025-09-07T16:59:32.6312526Z Downloading filelock-3.19.1-py3-none-any.whl (15 kB) 2025-09-07T16:59:32.6527949Z Collecting nvidia-cufile-cu12==1.11.1.6 2025-09-07T16:59:32.6564070Z Downloading nvidia_cufile_cu12-1.11.1.6-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.1 MB) 2025-09-07T16:59:32.6739072Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 75.6 
MB/s eta 0:00:00 2025-09-07T16:59:32.7002717Z Collecting nvidia-nvjitlink-cu12==12.6.85 2025-09-07T16:59:32.7036495Z Downloading nvidia_nvjitlink_cu12-12.6.85-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (19.7 MB) 2025-09-07T16:59:32.9798482Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.7/19.7 MB 61.7 MB/s eta 0:00:00 2025-09-07T16:59:33.0740796Z Collecting networkx 2025-09-07T16:59:33.0903947Z Downloading networkx-3.4.2-py3-none-any.whl (1.7 MB) 2025-09-07T16:59:33.1071366Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 114.0 MB/s eta 0:00:00 2025-09-07T16:59:33.1572037Z Collecting nvidia-cublas-cu12==12.6.4.1 2025-09-07T16:59:33.1605891Z Downloading nvidia_cublas_cu12-12.6.4.1-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (393.1 MB) 2025-09-07T16:59:38.1279574Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 393.1/393.1 MB 3.5 MB/s eta 0:00:00 2025-09-07T16:59:39.1492440Z Collecting fsspec 2025-09-07T16:59:39.1532436Z Downloading fsspec-2025.9.0-py3-none-any.whl (199 kB) 2025-09-07T16:59:39.1615710Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 199.3/199.3 KB 29.6 MB/s eta 0:00:00 2025-09-07T16:59:39.1857549Z Collecting nvidia-cusolver-cu12==11.7.1.2 2025-09-07T16:59:39.1892945Z Downloading nvidia_cusolver_cu12-11.7.1.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (158.2 MB) 2025-09-07T16:59:40.6059297Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 158.2/158.2 MB 15.9 MB/s eta 0:00:00 2025-09-07T16:59:41.0298915Z Collecting triton==3.3.1 2025-09-07T16:59:41.0367823Z Downloading triton-3.3.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (155.6 MB) 2025-09-07T16:59:42.3173340Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 155.6/155.6 MB 17.1 MB/s eta 0:00:00 2025-09-07T16:59:42.7301169Z Collecting nvidia-cuda-cupti-cu12==12.6.80 2025-09-07T16:59:42.7356510Z Downloading nvidia_cuda_cupti_cu12-12.6.80-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (8.9 MB) 2025-09-07T16:59:42.8208768Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.9/8.9 MB 
107.2 MB/s eta 0:00:00 2025-09-07T16:59:42.8656760Z Collecting nvidia-cuda-runtime-cu12==12.6.77 2025-09-07T16:59:42.8692112Z Downloading nvidia_cuda_runtime_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (897 kB) 2025-09-07T16:59:42.8855945Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 897.7/897.7 KB 64.0 MB/s eta 0:00:00 2025-09-07T16:59:42.9111121Z Collecting nvidia-cusparse-cu12==12.5.4.2 2025-09-07T16:59:42.9144741Z Downloading nvidia_cusparse_cu12-12.5.4.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (216.6 MB) 2025-09-07T16:59:46.4187010Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 216.6/216.6 MB 4.1 MB/s eta 0:00:00 2025-09-07T16:59:47.0943291Z Collecting jinja2 2025-09-07T16:59:47.0975323Z Downloading jinja2-3.1.6-py3-none-any.whl (134 kB) 2025-09-07T16:59:47.2050520Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.9/134.9 KB 1.2 MB/s eta 0:00:00 2025-09-07T16:59:47.3460486Z Collecting nvidia-curand-cu12==10.3.7.77 2025-09-07T16:59:47.3498526Z Downloading nvidia_curand_cu12-10.3.7.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (56.3 MB) 2025-09-07T16:59:48.6107303Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.3/56.3 MB 10.8 MB/s eta 0:00:00 2025-09-07T16:59:48.7536463Z Requirement already satisfied: typing-extensions>=4.10.0 in /home/charlie/.local/lib/python3.10/site-packages (from torch==2.7.1) (4.15.0) 2025-09-07T16:59:48.8626514Z Collecting nvidia-nccl-cu12==2.26.2 2025-09-07T16:59:48.8660262Z Downloading nvidia_nccl_cu12-2.26.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (201.3 MB) 2025-09-07T16:59:52.3385923Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 201.3/201.3 MB 4.3 MB/s eta 0:00:00 2025-09-07T16:59:53.0102619Z Collecting sympy>=1.13.3 2025-09-07T16:59:53.0135062Z Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB) 2025-09-07T16:59:53.2300589Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 29.3 MB/s eta 0:00:00 2025-09-07T16:59:53.3695395Z Collecting nvidia-cufft-cu12==11.3.0.4 
2025-09-07T16:59:53.3730930Z Downloading nvidia_cufft_cu12-11.3.0.4-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (200.2 MB) 2025-09-07T16:59:56.9738287Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 200.2/200.2 MB 4.2 MB/s eta 0:00:00 2025-09-07T16:59:57.5791726Z Collecting nvidia-nvtx-cu12==12.6.77 2025-09-07T16:59:57.5826966Z Downloading nvidia_nvtx_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89 kB) 2025-09-07T16:59:57.6955737Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.3/89.3 KB 712.3 kB/s eta 0:00:00 2025-09-07T16:59:57.7243779Z Requirement already satisfied: setuptools>=40.8.0 in /usr/lib/python3/dist-packages (from triton==3.3.1->torch==2.7.1) (59.6.0) 2025-09-07T16:59:57.9006710Z Collecting mpmath<1.4,>=1.1.0 2025-09-07T16:59:57.9036651Z Downloading mpmath-1.3.0-py3-none-any.whl (536 kB) 2025-09-07T16:59:58.0219952Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 KB 4.5 MB/s eta 0:00:00 2025-09-07T16:59:58.3199411Z Collecting MarkupSafe>=2.0 2025-09-07T16:59:58.3232522Z Downloading MarkupSafe-3.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (20 kB) 2025-09-07T16:59:58.7060650Z Installing collected packages: nvidia-cusparselt-cu12, mpmath, triton, sympy, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufile-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, networkx, MarkupSafe, fsspec, filelock, nvidia-cusparse-cu12, nvidia-cufft-cu12, nvidia-cudnn-cu12, jinja2, nvidia-cusolver-cu12, torch 2025-09-07T17:00:03.2024318Z WARNING: The scripts proton and proton-viewer are installed in '/home/charlie/.local/bin' which is not on PATH. 2025-09-07T17:00:03.2025281Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-09-07T17:00:07.2450145Z WARNING: The script isympy is installed in '/home/charlie/.local/bin' which is not on PATH. 
2025-09-07T17:00:07.2450989Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-09-07T17:00:42.4349835Z WARNING: The scripts torchfrtrace and torchrun are installed in '/home/charlie/.local/bin' which is not on PATH. 2025-09-07T17:00:42.4350806Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-09-07T17:00:42.7281815Z Successfully installed MarkupSafe-3.0.2 filelock-3.19.1 fsspec-2025.9.0 jinja2-3.1.6 mpmath-1.3.0 networkx-3.4.2 nvidia-cublas-cu12-12.6.4.1 nvidia-cuda-cupti-cu12-12.6.80 nvidia-cuda-nvrtc-cu12-12.6.77 nvidia-cuda-runtime-cu12-12.6.77 nvidia-cudnn-cu12-9.5.1.17 nvidia-cufft-cu12-11.3.0.4 nvidia-cufile-cu12-1.11.1.6 nvidia-curand-cu12-10.3.7.77 nvidia-cusolver-cu12-11.7.1.2 nvidia-cusparse-cu12-12.5.4.2 nvidia-cusparselt-cu12-0.6.3 nvidia-nccl-cu12-2.26.2 nvidia-nvjitlink-cu12-12.6.85 nvidia-nvtx-cu12-12.6.77 sympy-1.14.0 torch-2.7.1 triton-3.3.1 2025-09-07T17:00:43.2512721Z + echo DEVICE_NAME= 2025-09-07T17:00:43.2516276Z + echo DEVICE_TYPE= 2025-09-07T17:00:43.2643761Z ##[group]Run set -eux 2025-09-07T17:00:43.2644057Z set -eux 2025-09-07T17:00:43.2644295Z  2025-09-07T17:00:43.2644534Z if [[ -z "${GITHUB_TOKEN}" ]]; then 2025-09-07T17:00:43.2644900Z  echo "Missing github-token input" 2025-09-07T17:00:43.2645231Z  exit 1 2025-09-07T17:00:43.2645465Z fi 2025-09-07T17:00:43.2658080Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:43.2658460Z env: 2025-09-07T17:00:43.2658688Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:43.2659028Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:43.2659627Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:43.2660230Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:43.2660739Z DEVICE_NAME: 2025-09-07T17:00:43.2660995Z DEVICE_TYPE: 
2025-09-07T17:00:43.2661451Z GITHUB_TOKEN: *** 2025-09-07T17:00:43.2661695Z ##[endgroup] 2025-09-07T17:00:43.2769743Z + [[ -z *** ]] 2025-09-07T17:00:43.2953153Z ##[group]Run pytorch/test-infra/.github/actions/get-workflow-job-id@main 2025-09-07T17:00:43.2953606Z with: 2025-09-07T17:00:43.2954031Z github-token: *** 2025-09-07T17:00:43.2954272Z env: 2025-09-07T17:00:43.2954475Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:43.2954826Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:43.2955295Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:43.2955892Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:43.2956402Z DEVICE_NAME: 2025-09-07T17:00:43.2956628Z DEVICE_TYPE: 2025-09-07T17:00:43.2956856Z ##[endgroup] 2025-09-07T17:00:43.3297774Z ##[group]Run set -eux 2025-09-07T17:00:43.3298065Z set -eux 2025-09-07T17:00:43.3298298Z  2025-09-07T17:00:43.3298796Z python3 "${GITHUB_ACTION_PATH}/../../scripts/get_workflow_job_id.py" "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-09-07T17:00:43.3310754Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:43.3311149Z env: 2025-09-07T17:00:43.3311370Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:43.3311720Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:43.3312194Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:43.3312910Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:43.3313427Z DEVICE_NAME: 2025-09-07T17:00:43.3313663Z DEVICE_TYPE: 2025-09-07T17:00:43.3314131Z GITHUB_TOKEN: *** 2025-09-07T17:00:43.3314366Z ##[endgroup] 2025-09-07T17:00:43.3426808Z + python3 /home/charlie/_work/_actions/pytorch/test-infra/main/.github/actions/get-workflow-job-id/../../scripts/get_workflow_job_id.py 17525309334 i-03028b1668c838483-1003 2025-09-07T17:00:44.4221495Z setting job-id=49775768433 2025-09-07T17:00:44.4222066Z 
setting job-name=cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T17:00:44.4356393Z ##[group]Run set -eux 2025-09-07T17:00:44.4356708Z set -eux 2025-09-07T17:00:44.4356959Z  2025-09-07T17:00:44.4357184Z if [[ -n "" ]]; then 2025-09-07T17:00:44.4357483Z  source "" 2025-09-07T17:00:44.4357748Z fi 2025-09-07T17:00:44.4357996Z  2025-09-07T17:00:44.4358397Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_metadata.py" \ 2025-09-07T17:00:44.4358961Z  --schema-version "${SCHEMA_VERSION}" \ 2025-09-07T17:00:44.4359344Z  --repo "${REPO}" \ 2025-09-07T17:00:44.4359675Z  --head-branch "${HEAD_BRANCH}" \ 2025-09-07T17:00:44.4360024Z  --head-sha "${HEAD_SHA}" \ 2025-09-07T17:00:44.4360588Z  --workflow-id "${WORKFLOW_RUN_ID}" \ 2025-09-07T17:00:44.4360979Z  --run-attempt "${RUN_ATTEMPT}" \ 2025-09-07T17:00:44.4361342Z  --job-id "${JOB_ID}" \ 2025-09-07T17:00:44.4361672Z  --job-name "${JOB_NAME}" 2025-09-07T17:00:44.4374765Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:44.4375182Z env: 2025-09-07T17:00:44.4375424Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:44.4375784Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:44.4376281Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:44.4376877Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:44.4377544Z DEVICE_NAME: 2025-09-07T17:00:44.4377802Z DEVICE_TYPE: 2025-09-07T17:00:44.4378059Z SCHEMA_VERSION: v3 2025-09-07T17:00:44.4378476Z REPO: pytorch/pytorch 2025-09-07T17:00:44.4378771Z HEAD_BRANCH: refs/heads/main 2025-09-07T17:00:44.4379132Z HEAD_SHA: 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T17:00:44.4379507Z WORKFLOW_RUN_ID: 17525309334 2025-09-07T17:00:44.4379782Z RUN_ATTEMPT: 1 2025-09-07T17:00:44.4380035Z JOB_ID: 49775768433 2025-09-07T17:00:44.4380504Z JOB_NAME: cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, 
linux.aws.a100) 2025-09-07T17:00:44.4381039Z ##[endgroup] 2025-09-07T17:00:44.4484690Z + [[ -n '' ]] 2025-09-07T17:00:44.4487046Z + python3 /home/charlie/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_metadata.py --schema-version v3 --repo pytorch/pytorch --head-branch refs/heads/main --head-sha 93fb23d6fae7c4e82c4239a1033e522088742634 --workflow-id 17525309334 --run-attempt 1 --job-id 49775768433 --job-name 'cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100)' 2025-09-07T17:00:44.4810423Z ##[group]Run set -eux 2025-09-07T17:00:44.4810700Z set -eux 2025-09-07T17:00:44.4810920Z  2025-09-07T17:00:44.4811167Z if [[ -n "" ]]; then 2025-09-07T17:00:44.4811447Z  source "" 2025-09-07T17:00:44.4811689Z fi 2025-09-07T17:00:44.4811888Z  2025-09-07T17:00:44.4812291Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_runners_info.py" 2025-09-07T17:00:44.4824331Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:44.4824726Z env: 2025-09-07T17:00:44.4824949Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:44.4825284Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:44.4825874Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:44.4826476Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:44.4826985Z DEVICE_NAME: 2025-09-07T17:00:44.4827203Z DEVICE_TYPE: 2025-09-07T17:00:44.4827435Z ##[endgroup] 2025-09-07T17:00:44.4937830Z + [[ -n '' ]] 2025-09-07T17:00:44.4938642Z + python3 /home/charlie/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_runners_info.py 2025-09-07T17:00:45.3208564Z /home/charlie/.local/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:276: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at 
/pytorch/torch/csrc/utils/tensor_numpy.cpp:81.) 2025-09-07T17:00:45.3209835Z cpu = _conversion_method_template(device=torch.device("cpu")) 2025-09-07T17:00:46.5191968Z ##[group]Run set -eux 2025-09-07T17:00:46.5192265Z set -eux 2025-09-07T17:00:46.5192518Z  2025-09-07T17:00:46.5192765Z # TODO (huydhn): Implement this part 2025-09-07T17:00:46.5193165Z echo "dependencies={}" >> "${GITHUB_OUTPUT}" 2025-09-07T17:00:46.5206087Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:46.5206501Z env: 2025-09-07T17:00:46.5206734Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:46.5207069Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:46.5207737Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:46.5208342Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:46.5208852Z DEVICE_NAME: 2025-09-07T17:00:46.5209075Z DEVICE_TYPE: 2025-09-07T17:00:46.5209308Z ##[endgroup] 2025-09-07T17:00:46.5296579Z + echo 'dependencies={}' 2025-09-07T17:00:46.5621693Z ##[group]Run set -eux 2025-09-07T17:00:46.5621985Z set -eux 2025-09-07T17:00:46.5622206Z  2025-09-07T17:00:46.5622442Z if [[ -n "" ]]; then 2025-09-07T17:00:46.5622720Z  source "" 2025-09-07T17:00:46.5622965Z fi 2025-09-07T17:00:46.5623170Z  2025-09-07T17:00:46.5623439Z if [[ ! 
-d "${BENCHMARK_RESULTS_DIR}" ]]; then 2025-09-07T17:00:46.5623893Z  echo "${BENCHMARK_RESULTS_DIR} does not exist, skipping" 2025-09-07T17:00:46.5624522Z  # We don't want the job to fail if the directory doesn't exist 2025-09-07T17:00:46.5624944Z  exit 0 2025-09-07T17:00:46.5625163Z fi 2025-09-07T17:00:46.5625379Z  2025-09-07T17:00:46.5625623Z if [[ "${DRY_RUN}" == "true" ]]; then 2025-09-07T17:00:46.5626119Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-09-07T17:00:46.5626688Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-09-07T17:00:46.5627129Z  --metadata "${BENCHMARK_METADATA}" \ 2025-09-07T17:00:46.5627495Z  --runners "${RUNNER_INFO}" \ 2025-09-07T17:00:46.5627854Z  --dependencies "${DEPENDENCIES}" \ 2025-09-07T17:00:46.5628196Z  --dry-run 2025-09-07T17:00:46.5628438Z else 2025-09-07T17:00:46.5628831Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-09-07T17:00:46.5629404Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-09-07T17:00:46.5629844Z  --metadata "${BENCHMARK_METADATA}" \ 2025-09-07T17:00:46.5630195Z  --runners "${RUNNER_INFO}" \ 2025-09-07T17:00:46.5630545Z  --dependencies "${DEPENDENCIES}" 2025-09-07T17:00:46.5630870Z fi 2025-09-07T17:00:46.5642598Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:46.5643168Z env: 2025-09-07T17:00:46.5643379Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:46.5643724Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:46.5644195Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:46.5644913Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:46.5645409Z DEVICE_NAME: 2025-09-07T17:00:46.5645647Z DEVICE_TYPE: 2025-09-07T17:00:46.5645913Z BENCHMARK_RESULTS_DIR: test/test-reports 2025-09-07T17:00:46.5646249Z DRY_RUN: false 2025-09-07T17:00:46.5647658Z BENCHMARK_METADATA: {"timestamp": 1757264444, 
"schema_version": "v3", "name": "cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "93fb23d6fae7c4e82c4239a1033e522088742634", "workflow_id": 17525309334, "run_attempt": 1, "job_id": 49775768433} 2025-09-07T17:00:46.5649638Z RUNNER_INFO: [{"cpu_info": "x86_64", "cpu_count": 96, "avail_mem_in_gb": 1121, "extra_info": {"hostname": "15a98ee0aa9d"}, "name": "cuda", "type": "NVIDIA A100-SXM4-40GB", "gpu_count": 1, "avail_gpu_mem_in_gb": 39}] 2025-09-07T17:00:46.5650481Z DEPENDENCIES: {} 2025-09-07T17:00:46.5650732Z ##[endgroup] 2025-09-07T17:00:46.5740237Z + [[ -n '' ]] 2025-09-07T17:00:46.5740473Z + [[ ! -d test/test-reports ]] 2025-09-07T17:00:46.5740768Z + [[ false == \t\r\u\e ]] 2025-09-07T17:00:46.5743921Z + python3 /home/charlie/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py --benchmark-results-dir test/test-reports --metadata '{"timestamp": 1757264444, "schema_version": "v3", "name": "cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "93fb23d6fae7c4e82c4239a1033e522088742634", "workflow_id": 17525309334, "run_attempt": 1, "job_id": 49775768433}' --runners '[{"cpu_info": "x86_64", "cpu_count": 96, "avail_mem_in_gb": 1121, "extra_info": {"hostname": "15a98ee0aa9d"}, "name": "cuda", "type": "NVIDIA A100-SXM4-40GB", "gpu_count": 1, "avail_gpu_mem_in_gb": 39}]' --dependencies '{}' 2025-09-07T17:00:46.7100370Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_performance.json 2025-09-07T17:00:46.7500457Z INFO:botocore.credentials:Found credentials from IAM Role: 
gh-ci-github-action-runners-runner-role 2025-09-07T17:00:46.9663283Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_accuracy.json 2025-09-07T17:00:47.0563069Z INFO:root:Upload test/test-reports/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_accuracy.json 2025-09-07T17:00:47.1780679Z INFO:root:Upload test/test-reports/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json 2025-09-07T17:00:47.3249310Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json 2025-09-07T17:00:47.4959323Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_torchbench_amp_training_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_cpp_wrapper_torchbench_amp_training_cuda_performance_compilation_metrics.json 2025-09-07T17:00:47.6357685Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_performance.json 2025-09-07T17:00:47.7834972Z INFO:root:Upload 
test/test-reports/inductor_max_autotune_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_max_autotune_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json 2025-09-07T17:00:47.9658784Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_accuracy.json 2025-09-07T17:00:48.0928683Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json 2025-09-07T17:00:48.2619399Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json 2025-09-07T17:00:48.4620796Z INFO:root:Upload test/test-reports/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_performance.json 2025-09-07T17:00:48.5959405Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_torchbench_amp_training_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_torchbench_amp_training_cuda_accuracy.json 2025-09-07T17:00:48.7299399Z INFO:root:Upload test/test-reports/inductor_max_autotune_torchbench_amp_training_cuda_accuracy.json to 
s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_max_autotune_torchbench_amp_training_cuda_accuracy.json 2025-09-07T17:00:48.8506670Z INFO:root:Upload test/test-reports/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_performance.json 2025-09-07T17:00:48.9764178Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_accuracy.json 2025-09-07T17:00:49.0975778Z INFO:root:Upload test/test-reports/inductor_dynamic_torchbench_amp_training_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_dynamic_torchbench_amp_training_cuda_accuracy.json 2025-09-07T17:00:49.2553378Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json 2025-09-07T17:00:49.4816700Z INFO:root:Upload test/test-reports/inductor_dynamic_torchbench_amp_training_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_dynamic_torchbench_amp_training_cuda_performance_compilation_metrics.json 2025-09-07T17:00:49.6523496Z INFO:root:Upload test/test-reports/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json 2025-09-07T17:00:49.8275372Z INFO:root:Upload 
test/test-reports/inductor_dynamic_torchbench_amp_training_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_dynamic_torchbench_amp_training_cuda_performance.json 2025-09-07T17:00:50.0420028Z INFO:root:Upload test/test-reports/inductor_export_torchbench_bfloat16_inference_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_export_torchbench_bfloat16_inference_cuda_accuracy.json 2025-09-07T17:00:50.1982215Z INFO:root:Upload test/test-reports/inductor_max_autotune_torchbench_bfloat16_inference_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_max_autotune_torchbench_bfloat16_inference_cuda_accuracy.json 2025-09-07T17:00:50.3154616Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_torchbench_amp_training_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_torchbench_amp_training_cuda_performance.json 2025-09-07T17:00:50.4446212Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_torchbench_amp_training_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_cpp_wrapper_torchbench_amp_training_cuda_performance.json 2025-09-07T17:00:50.6196957Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_torchbench_amp_training_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_cpp_wrapper_torchbench_amp_training_cuda_accuracy.json 2025-09-07T17:00:50.7234324Z INFO:root:Upload test/test-reports/inductor_dynamic_torchbench_bfloat16_inference_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_dynamic_torchbench_bfloat16_inference_cuda_accuracy.json 2025-09-07T17:00:50.8476671Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_performance.json to 
s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_performance.json 2025-09-07T17:00:50.9808670Z INFO:root:Upload test/test-reports/inductor_no_cudagraphs_torchbench_amp_training_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_no_cudagraphs_torchbench_amp_training_cuda_accuracy.json 2025-09-07T17:00:51.1053427Z INFO:root:Upload test/test-reports/inductor_dynamic_torchbench_bfloat16_inference_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_dynamic_torchbench_bfloat16_inference_cuda_performance.json 2025-09-07T17:00:51.2290264Z INFO:root:Upload test/test-reports/inductor_max_autotune_torchbench_amp_training_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_max_autotune_torchbench_amp_training_cuda_performance.json 2025-09-07T17:00:51.3854553Z INFO:root:Upload test/test-reports/inductor_max_autotune_torchbench_amp_training_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_max_autotune_torchbench_amp_training_cuda_performance_compilation_metrics.json 2025-09-07T17:00:51.5638583Z INFO:root:Upload test/test-reports/inductor_no_cudagraphs_torchbench_amp_training_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_no_cudagraphs_torchbench_amp_training_cuda_performance_compilation_metrics.json 2025-09-07T17:00:51.7366020Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_torchbench_amp_training_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_torchbench_amp_training_cuda_performance_compilation_metrics.json 2025-09-07T17:00:51.8883441Z INFO:root:Upload 
test/test-reports/inductor_max_autotune_torchbench_bfloat16_inference_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_max_autotune_torchbench_bfloat16_inference_cuda_performance.json 2025-09-07T17:00:52.0127154Z INFO:root:Upload test/test-reports/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_accuracy.json 2025-09-07T17:00:52.1519883Z INFO:root:Upload test/test-reports/inductor_no_cudagraphs_torchbench_amp_training_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_no_cudagraphs_torchbench_amp_training_cuda_performance.json 2025-09-07T17:00:52.3076801Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_performance.json 2025-09-07T17:00:52.4340363Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_accuracy.json 2025-09-07T17:00:52.5739704Z INFO:root:Upload test/test-reports/inductor_dynamic_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525309334/49775768433/inductor_dynamic_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json 2025-09-07T17:00:52.7960512Z ##[group]Run cat test/**/*_toprint.log || true 2025-09-07T17:00:52.7960923Z cat test/**/*_toprint.log || true 2025-09-07T17:00:52.7973816Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:52.7974219Z env: 2025-09-07T17:00:52.7974443Z GIT_DEFAULT_BRANCH: main 
2025-09-07T17:00:52.7974786Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:52.7975240Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:52.7975839Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:52.7976354Z DEVICE_NAME: 2025-09-07T17:00:52.7976588Z DEVICE_TYPE: 2025-09-07T17:00:52.7976812Z ##[endgroup] 2025-09-07T17:00:52.8165371Z cat: 'test/**/*_toprint.log': No such file or directory 2025-09-07T17:00:52.8410222Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2025-09-07T17:00:52.8410716Z kill "$MONITOR_SCRIPT_PID" 2025-09-07T17:00:52.8422747Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:52.8423145Z env: 2025-09-07T17:00:52.8423370Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:52.8423717Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:52.8424187Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:52.8424772Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:52.8425280Z DEVICE_NAME: 2025-09-07T17:00:52.8425514Z DEVICE_TYPE: 2025-09-07T17:00:52.8425753Z MONITOR_SCRIPT_PID: 11309 2025-09-07T17:00:52.8426019Z ##[endgroup] 2025-09-07T17:00:52.8663570Z Prepare all required actions 2025-09-07T17:00:52.8664082Z Getting action download info 2025-09-07T17:00:53.0147564Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-09-07T17:00:53.6895047Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-09-07T17:00:55.5940134Z ##[group]Run ./.github/actions/upload-test-artifacts 2025-09-07T17:00:55.5940497Z with: 2025-09-07T17:00:55.5940867Z file-suffix: test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433 2025-09-07T17:00:55.5941321Z s3-bucket: gha-artifacts 2025-09-07T17:00:55.5941594Z env: 2025-09-07T17:00:55.5941806Z 
GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:55.5942142Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:55.5942603Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:55.5943226Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:55.5943867Z DEVICE_NAME: 2025-09-07T17:00:55.5944076Z DEVICE_TYPE: 2025-09-07T17:00:55.5944323Z ##[endgroup] 2025-09-07T17:00:55.6385519Z ##[group]Run # Remove any previous test jsons if they exist 2025-09-07T17:00:55.6386014Z # Remove any previous test jsons if they exist 2025-09-07T17:00:55.6386423Z rm -f test-jsons-*.zip 2025-09-07T17:00:55.6386857Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test/test-reports -i '*.json' 2025-09-07T17:00:55.6399840Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:55.6400250Z env: 2025-09-07T17:00:55.6400486Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:55.6400841Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:55.6401303Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:55.6401913Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:55.6402445Z DEVICE_NAME: 2025-09-07T17:00:55.6402699Z DEVICE_TYPE: 2025-09-07T17:00:55.6403408Z FILE_SUFFIX: test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433 2025-09-07T17:00:55.6403863Z ##[endgroup] 2025-09-07T17:00:55.6600718Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.6635726Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.6670450Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.6743965Z adding: 
test/test-reports/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.6885428Z adding: test/test-reports/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.7055539Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_amp_training_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.7107403Z adding: test/test-reports/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.7280962Z adding: test/test-reports/inductor_max_autotune_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.7315729Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.7459081Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.7624534Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.7659802Z adding: test/test-reports/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.7692576Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_amp_training_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.7725652Z adding: test/test-reports/inductor_max_autotune_torchbench_amp_training_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.7777555Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.7812770Z adding: test/test-reports/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_accuracy.json (deflated 99%) 
2025-09-07T17:00:55.7845446Z adding: test/test-reports/inductor_dynamic_torchbench_amp_training_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.7991365Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.8138867Z adding: test/test-reports/inductor_dynamic_torchbench_amp_training_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.8275771Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.8321553Z adding: test/test-reports/inductor_dynamic_torchbench_amp_training_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.8356158Z adding: test/test-reports/inductor_export_torchbench_bfloat16_inference_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.8390700Z adding: test/test-reports/inductor_max_autotune_torchbench_bfloat16_inference_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.8436669Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_amp_training_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.8488615Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_amp_training_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.8521488Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_amp_training_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.8555923Z adding: test/test-reports/inductor_dynamic_torchbench_bfloat16_inference_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.8608164Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.8642526Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_amp_training_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.8694456Z adding: test/test-reports/inductor_dynamic_torchbench_bfloat16_inference_cuda_performance.json 
(deflated 99%) 2025-09-07T17:00:55.8740280Z adding: test/test-reports/inductor_max_autotune_torchbench_amp_training_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.8914128Z adding: test/test-reports/inductor_max_autotune_torchbench_amp_training_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.9074711Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_amp_training_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.9222159Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_amp_training_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.9273954Z adding: test/test-reports/inductor_max_autotune_torchbench_bfloat16_inference_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.9300986Z adding: test/test-reports/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.9352737Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_amp_training_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.9404991Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_performance.json (deflated 99%) 2025-09-07T17:00:55.9440204Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_accuracy.json (deflated 99%) 2025-09-07T17:00:55.9584078Z adding: test/test-reports/inductor_dynamic_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.json (deflated 99%) 2025-09-07T17:00:55.9722052Z ##[group]Run # Remove any previous test reports if they exist 2025-09-07T17:00:55.9722546Z # Remove any previous test reports if they exist 2025-09-07T17:00:55.9723189Z rm -f test-reports-*.zip 2025-09-07T17:00:55.9723668Z zip -r "test-reports-${FILE_SUFFIX}.zip" test/test-reports -i '*.xml' -i '*.csv' 2025-09-07T17:00:55.9735727Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:55.9736118Z env: 
2025-09-07T17:00:55.9736340Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:55.9736677Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:55.9737126Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:55.9737827Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:55.9738338Z DEVICE_NAME: 2025-09-07T17:00:55.9738693Z DEVICE_TYPE: 2025-09-07T17:00:55.9739071Z FILE_SUFFIX: test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433 2025-09-07T17:00:55.9739509Z ##[endgroup] 2025-09-07T17:00:55.9874985Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_amp_training_cuda_performance_compilation_metrics.csv (deflated 50%) 2025-09-07T17:00:55.9876028Z adding: test/test-reports/inductor_dynamic_torchbench_bfloat16_inference_cuda_performance.csv (deflated 52%) 2025-09-07T17:00:55.9877059Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_accuracy.csv (deflated 59%) 2025-09-07T17:00:55.9879159Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.csv (deflated 51%) 2025-09-07T17:00:55.9880202Z adding: test/test-reports/inductor_max_autotune_torchbench_bfloat16_inference_cuda_accuracy.csv (deflated 58%) 2025-09-07T17:00:55.9881180Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_performance.csv (deflated 52%) 2025-09-07T17:00:55.9882145Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_performance.csv (deflated 52%) 2025-09-07T17:00:55.9883310Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_performance.csv (deflated 52%) 2025-09-07T17:00:55.9884429Z adding: test/test-reports/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_performance.csv (deflated 52%) 2025-09-07T17:00:55.9885491Z adding: 
test/test-reports/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_accuracy.csv (deflated 58%) 2025-09-07T17:00:55.9886479Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_amp_training_cuda_performance.csv (deflated 51%) 2025-09-07T17:00:55.9887399Z adding: test/test-reports/inductor_max_autotune_torchbench_amp_training_cuda_performance.csv (deflated 50%) 2025-09-07T17:00:55.9888367Z adding: test/test-reports/inductor_dynamic_torchbench_amp_training_cuda_performance_compilation_metrics.csv (deflated 50%) 2025-09-07T17:00:55.9889309Z adding: test/test-reports/inductor_dynamic_torchbench_amp_training_cuda_accuracy.csv (deflated 57%) 2025-09-07T17:00:55.9890312Z adding: test/test-reports/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.csv (deflated 50%) 2025-09-07T17:00:55.9891493Z adding: test/test-reports/inductor_max_autotune_torchbench_bfloat16_inference_cuda_performance.csv (deflated 52%) 2025-09-07T17:00:55.9892421Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_amp_training_cuda_accuracy.csv (deflated 57%) 2025-09-07T17:00:55.9894473Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.csv (deflated 52%) 2025-09-07T17:00:55.9895607Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_accuracy.csv (deflated 59%) 2025-09-07T17:00:55.9896673Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_amp_training_cuda_accuracy.csv (deflated 58%) 2025-09-07T17:00:55.9898534Z adding: test/test-reports/inductor_with_cudagraphs_freezing_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.csv (deflated 52%) 2025-09-07T17:00:55.9899612Z adding: test/test-reports/inductor_export_torchbench_bfloat16_inference_cuda_accuracy.csv (deflated 61%) 2025-09-07T17:00:55.9900570Z adding: 
test/test-reports/inductor_cudagraphs_low_precision_torchbench_quant_inference_cuda_accuracy.csv (deflated 59%) 2025-09-07T17:00:55.9901613Z adding: test/test-reports/inductor_cudagraphs_low_precision_torchbench_quant_inference_cuda_performance.csv (deflated 52%) 2025-09-07T17:00:55.9902588Z adding: test/test-reports/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_accuracy.csv (deflated 68%) 2025-09-07T17:00:55.9903780Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_amp_training_cuda_performance_compilation_metrics.csv (deflated 50%) 2025-09-07T17:00:55.9906853Z adding: test/test-reports/inductor_dynamic_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.csv (deflated 52%) 2025-09-07T17:00:55.9909741Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.csv (deflated 52%) 2025-09-07T17:00:55.9910786Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_amp_training_cuda_performance.csv (deflated 51%) 2025-09-07T17:00:55.9911712Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_accuracy.csv (deflated 59%) 2025-09-07T17:00:55.9912610Z adding: test/test-reports/inductor_max_autotune_torchbench_amp_training_cuda_accuracy.csv (deflated 57%) 2025-09-07T17:00:55.9913498Z adding: test/test-reports/inductor_dynamic_torchbench_bfloat16_inference_cuda_accuracy.csv (deflated 59%) 2025-09-07T17:00:55.9914435Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_performance.csv (deflated 52%) 2025-09-07T17:00:55.9915835Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_amp_training_cuda_performance_compilation_metrics.csv (deflated 51%) 2025-09-07T17:00:55.9916857Z adding: test/test-reports/inductor_no_cudagraphs_torchbench_bfloat16_inference_cuda_accuracy.csv (deflated 59%) 2025-09-07T17:00:55.9917769Z adding: test/test-reports/inductor_dynamic_torchbench_amp_training_cuda_performance.csv (deflated 
50%) 2025-09-07T17:00:55.9918673Z adding: test/test-reports/inductor_aot_inductor_torchbench_bfloat16_inference_cuda_performance.csv (deflated 54%) 2025-09-07T17:00:55.9919612Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_amp_training_cuda_accuracy.csv (deflated 56%) 2025-09-07T17:00:55.9920534Z adding: test/test-reports/inductor_with_cudagraphs_torchbench_amp_training_cuda_performance.csv (deflated 50%) 2025-09-07T17:00:55.9922325Z adding: test/test-reports/inductor_max_autotune_torchbench_amp_training_cuda_performance_compilation_metrics.csv (deflated 51%) 2025-09-07T17:00:55.9925438Z adding: test/test-reports/inductor_cpp_wrapper_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.csv (deflated 51%) 2025-09-07T17:00:55.9928808Z adding: test/test-reports/inductor_max_autotune_torchbench_bfloat16_inference_cuda_performance_compilation_metrics.csv (deflated 52%) 2025-09-07T17:00:56.0031104Z ##[group]Run # Remove any previous usage logs if they exist 2025-09-07T17:00:56.0031571Z # Remove any previous usage logs if they exist 2025-09-07T17:00:56.0031953Z rm -f logs-*.zip 2025-09-07T17:00:56.0032298Z zip "logs-${FILE_SUFFIX}.zip" 'usage_log.txt' || true 2025-09-07T17:00:56.0032819Z zip -r "logs-${FILE_SUFFIX}.zip" test/test-reports -i '*.log' || true 2025-09-07T17:00:56.0044545Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:56.0044937Z env: 2025-09-07T17:00:56.0045160Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:56.0045493Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:56.0046119Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:56.0046723Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:56.0047230Z DEVICE_NAME: 2025-09-07T17:00:56.0047464Z DEVICE_TYPE: 2025-09-07T17:00:56.0047836Z FILE_SUFFIX: test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433 2025-09-07T17:00:56.0048284Z ##[endgroup] 
2025-09-07T17:00:56.0420527Z adding: usage_log.txt (deflated 92%) 2025-09-07T17:00:56.0435764Z 2025-09-07T17:00:56.0436148Z zip error: Nothing to do! (logs-test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433.zip) 2025-09-07T17:00:56.0486996Z ##[group]Run # Remove any previous debugging artifacts if they exist 2025-09-07T17:00:56.0487534Z # Remove any previous debugging artifacts if they exist 2025-09-07T17:00:56.0487935Z rm -f debug-*.zip 2025-09-07T17:00:56.0488227Z if [ -d 'test/debug' ]; then 2025-09-07T17:00:56.0488673Z  zip -r "debug-${FILE_SUFFIX}.zip" test/debug 2025-09-07T17:00:56.0489027Z fi 2025-09-07T17:00:56.0500345Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:56.0500740Z env: 2025-09-07T17:00:56.0500962Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:56.0501312Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:56.0501787Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:56.0502375Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:56.0502883Z DEVICE_NAME: 2025-09-07T17:00:56.0503114Z DEVICE_TYPE: 2025-09-07T17:00:56.0503494Z FILE_SUFFIX: test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433 2025-09-07T17:00:56.0503927Z ##[endgroup] 2025-09-07T17:00:56.0913095Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-09-07T17:00:56.0913428Z with: 2025-09-07T17:00:56.0913654Z s3-bucket: gha-artifacts 2025-09-07T17:00:56.0914001Z s3-prefix: pytorch/pytorch/17525309334/1/artifact 2025-09-07T17:00:56.0914361Z retention-days: 14 2025-09-07T17:00:56.0914609Z if-no-files-found: warn 2025-09-07T17:00:56.0914886Z path: test-jsons-*.zip 2025-09-07T17:00:56.0915148Z name: artifact 2025-09-07T17:00:56.0915379Z region: us-east-1 2025-09-07T17:00:56.0915601Z env: 2025-09-07T17:00:56.0915815Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:56.0916152Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:56.0916619Z 
SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:56.0917205Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:56.0917709Z DEVICE_NAME: 2025-09-07T17:00:56.0930286Z DEVICE_TYPE: 2025-09-07T17:00:56.0930510Z ##[endgroup] 2025-09-07T17:00:56.4260720Z NOTE: s3-prefix specified, ignoring name parameter 2025-09-07T17:00:56.4261211Z With the provided path, there will be 1 file uploaded 2025-09-07T17:00:56.4261710Z Uploading to s3 prefix: pytorch/pytorch/17525309334/1/artifact 2025-09-07T17:00:56.4269543Z Starting upload of test-jsons-test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433.zip 2025-09-07T17:00:56.5615271Z Finished upload of test-jsons-test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433.zip 2025-09-07T17:00:56.5846957Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-09-07T17:00:56.5847295Z with: 2025-09-07T17:00:56.5847529Z s3-bucket: gha-artifacts 2025-09-07T17:00:56.5847861Z s3-prefix: pytorch/pytorch/17525309334/1/artifact 2025-09-07T17:00:56.5848222Z retention-days: 14 2025-09-07T17:00:56.5848471Z if-no-files-found: error 2025-09-07T17:00:56.5848756Z path: test-reports-*.zip 2025-09-07T17:00:56.5849024Z name: artifact 2025-09-07T17:00:56.5849258Z region: us-east-1 2025-09-07T17:00:56.5849476Z env: 2025-09-07T17:00:56.5849699Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:56.5850041Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:56.5850724Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:56.5851312Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:56.5851821Z DEVICE_NAME: 2025-09-07T17:00:56.5852053Z DEVICE_TYPE: 2025-09-07T17:00:56.5852280Z ##[endgroup] 2025-09-07T17:00:56.9188937Z NOTE: s3-prefix specified, ignoring name parameter 2025-09-07T17:00:56.9189423Z With the provided path, there will be 1 file uploaded 2025-09-07T17:00:56.9189878Z Uploading 
to s3 prefix: pytorch/pytorch/17525309334/1/artifact 2025-09-07T17:00:56.9197846Z Starting upload of test-reports-test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433.zip 2025-09-07T17:00:57.0448527Z Finished upload of test-reports-test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433.zip 2025-09-07T17:00:57.0750048Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-09-07T17:00:57.0750402Z with: 2025-09-07T17:00:57.0750630Z s3-bucket: gha-artifacts 2025-09-07T17:00:57.0751106Z s3-prefix: pytorch/pytorch/17525309334/1/artifact 2025-09-07T17:00:57.0751455Z retention-days: 14 2025-09-07T17:00:57.0751721Z if-no-files-found: ignore 2025-09-07T17:00:57.0752000Z path: logs-*.zip 2025-09-07T17:00:57.0752241Z name: artifact 2025-09-07T17:00:57.0752462Z region: us-east-1 2025-09-07T17:00:57.0752696Z env: 2025-09-07T17:00:57.0752923Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:57.0753264Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:57.0753725Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:57.0754316Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:57.0754815Z DEVICE_NAME: 2025-09-07T17:00:57.0755043Z DEVICE_TYPE: 2025-09-07T17:00:57.0755268Z ##[endgroup] 2025-09-07T17:00:57.4061005Z NOTE: s3-prefix specified, ignoring name parameter 2025-09-07T17:00:57.4061487Z With the provided path, there will be 1 file uploaded 2025-09-07T17:00:57.4061976Z Uploading to s3 prefix: pytorch/pytorch/17525309334/1/artifact 2025-09-07T17:00:57.4069880Z Starting upload of logs-test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433.zip 2025-09-07T17:00:57.5301806Z Finished upload of logs-test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433.zip 2025-09-07T17:00:57.5518438Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-09-07T17:00:57.5518796Z with: 2025-09-07T17:00:57.5519022Z s3-bucket: gha-artifacts 2025-09-07T17:00:57.5519353Z s3-prefix: 
pytorch/pytorch/17525309334/1/artifact 2025-09-07T17:00:57.5519704Z retention-days: 14 2025-09-07T17:00:57.5519966Z if-no-files-found: ignore 2025-09-07T17:00:57.5520247Z path: debug-*.zip 2025-09-07T17:00:57.5520497Z name: artifact 2025-09-07T17:00:57.5520720Z region: us-east-1 2025-09-07T17:00:57.5520951Z env: 2025-09-07T17:00:57.5521165Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:57.5521512Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:57.5521974Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:57.5522585Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:57.5523277Z DEVICE_NAME: 2025-09-07T17:00:57.5523511Z DEVICE_TYPE: 2025-09-07T17:00:57.5523741Z ##[endgroup] 2025-09-07T17:00:57.8772997Z No files were found with the provided path: debug-*.zip. No artifacts will be uploaded. 2025-09-07T17:00:57.9068246Z ##[group]Run # shellcheck disable=SC2156 2025-09-07T17:00:57.9068636Z # shellcheck disable=SC2156 2025-09-07T17:00:57.9069248Z find . 
-iname "core.[1-9]*" -exec docker exec "${DOCKER_CONTAINER_ID}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2025-09-07T17:00:57.9082010Z shell: /usr/bin/bash -e {0} 2025-09-07T17:00:57.9082296Z env: 2025-09-07T17:00:57.9082504Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:57.9083020Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:57.9083496Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:57.9084108Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:57.9084601Z DEVICE_NAME: 2025-09-07T17:00:57.9084831Z DEVICE_TYPE: 2025-09-07T17:00:57.9085057Z ##[endgroup] 2025-09-07T17:00:58.4610680Z Prepare all required actions 2025-09-07T17:00:58.4611105Z Getting action download info 2025-09-07T17:00:58.5599352Z ##[group]Run ./.github/actions/upload-utilization-stats 2025-09-07T17:00:58.5599732Z with: 2025-09-07T17:00:58.5599947Z job_id: 49775768433 2025-09-07T17:00:58.5600399Z job_name: cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T17:00:58.5600936Z workflow_name: inductor-A100-perf-nightly 2025-09-07T17:00:58.5601278Z workflow_run_id: 17525309334 2025-09-07T17:00:58.5601562Z workflow_attempt: 1 2025-09-07T17:00:58.5601805Z env: 2025-09-07T17:00:58.5602010Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:58.5602352Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:58.5603091Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:58.5603692Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:58.5604201Z DEVICE_NAME: 2025-09-07T17:00:58.5604464Z DEVICE_TYPE: 2025-09-07T17:00:58.5604682Z ##[endgroup] 2025-09-07T17:00:58.5758974Z ##[group]Run echo "workflow_id: 17525309334" 2025-09-07T17:00:58.5759357Z echo "workflow_id: 17525309334" 2025-09-07T17:00:58.5759699Z echo "workflow_attempt: 1" 2025-09-07T17:00:58.5760115Z echo "workflow_Name: 
inductor-A100-perf-nightly" 2025-09-07T17:00:58.5760516Z echo "job_id: 49775768433" 2025-09-07T17:00:58.5761063Z echo "job_name: cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100)" 2025-09-07T17:00:58.5761625Z echo "artifact_prefix: " 2025-09-07T17:00:58.5761945Z python3 --version 2025-09-07T17:00:58.5774298Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:00:58.5774702Z env: 2025-09-07T17:00:58.5774910Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:58.5775254Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:58.5775729Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:58.5776328Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:58.5776838Z DEVICE_NAME: 2025-09-07T17:00:58.5777058Z DEVICE_TYPE: 2025-09-07T17:00:58.5777378Z ##[endgroup] 2025-09-07T17:00:58.5848335Z workflow_id: 17525309334 2025-09-07T17:00:58.5848626Z workflow_attempt: 1 2025-09-07T17:00:58.5848918Z workflow_Name: inductor-A100-perf-nightly 2025-09-07T17:00:58.5849257Z job_id: 49775768433 2025-09-07T17:00:58.5849697Z job_name: cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100) 2025-09-07T17:00:58.5850205Z artifact_prefix: 2025-09-07T17:00:58.5863492Z Python 3.10.12 2025-09-07T17:00:58.6111943Z ##[group]Run nick-fields/retry@v3.0.0 2025-09-07T17:00:58.6112267Z with: 2025-09-07T17:00:58.6112464Z shell: bash 2025-09-07T17:00:58.6112692Z timeout_minutes: 5 2025-09-07T17:00:58.6112940Z max_attempts: 5 2025-09-07T17:00:58.6113191Z retry_wait_seconds: 30 2025-09-07T17:00:58.6113742Z command: set -eu python3 -m pip install python-dateutil==2.8.2 boto3==1.35.42 pandas==2.1.3 dataclasses_json==0.6.7 2025-09-07T17:00:58.6114474Z polling_interval_seconds: 1 2025-09-07T17:00:58.6114768Z warning_on_retry: true 2025-09-07T17:00:58.6115043Z continue_on_error: false 2025-09-07T17:00:58.6115293Z env: 2025-09-07T17:00:58.6115517Z 
GIT_DEFAULT_BRANCH: main 2025-09-07T17:00:58.6115851Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:00:58.6116318Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:00:58.6116899Z DOCKER_CONTAINER_ID: 3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:00:58.6117404Z DEVICE_NAME: 2025-09-07T17:00:58.6117645Z DEVICE_TYPE: 2025-09-07T17:00:58.6117874Z ##[endgroup] 2025-09-07T17:00:58.9950465Z Defaulting to user installation because normal site-packages is not writeable 2025-09-07T17:00:59.3809698Z Collecting python-dateutil==2.8.2 2025-09-07T17:00:59.4478797Z Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB) 2025-09-07T17:00:59.5420781Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 247.7/247.7 KB 2.6 MB/s eta 0:00:00 2025-09-07T17:01:00.2197817Z Collecting boto3==1.35.42 2025-09-07T17:01:00.2292039Z Downloading boto3-1.35.42-py3-none-any.whl (139 kB) 2025-09-07T17:01:00.2759858Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.2/139.2 KB 2.9 MB/s eta 0:00:00 2025-09-07T17:01:00.7538857Z Collecting pandas==2.1.3 2025-09-07T17:01:00.7639677Z Downloading pandas-2.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.3 MB) 2025-09-07T17:01:01.4119431Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.3/12.3 MB 19.7 MB/s eta 0:00:00 2025-09-07T17:01:01.4740513Z Requirement already satisfied: dataclasses_json==0.6.7 in /home/charlie/.local/lib/python3.10/site-packages (0.6.7) 2025-09-07T17:01:01.4758641Z Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil==2.8.2) (1.16.0) 2025-09-07T17:01:01.4807534Z Requirement already satisfied: s3transfer<0.11.0,>=0.10.0 in /home/charlie/.local/lib/python3.10/site-packages (from boto3==1.35.42) (0.10.4) 2025-09-07T17:01:01.4813087Z Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /home/charlie/.local/lib/python3.10/site-packages (from boto3==1.35.42) (1.0.1) 2025-09-07T17:01:01.4818656Z Requirement 
already satisfied: botocore<1.36.0,>=1.35.42 in /home/charlie/.local/lib/python3.10/site-packages (from boto3==1.35.42) (1.35.99) 2025-09-07T17:01:01.6866472Z Collecting tzdata>=2022.1 2025-09-07T17:01:01.6952234Z Downloading tzdata-2025.2-py2.py3-none-any.whl (347 kB) 2025-09-07T17:01:01.8503403Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 347.8/347.8 KB 2.2 MB/s eta 0:00:00 2025-09-07T17:01:02.4490857Z Collecting numpy<2,>=1.22.4 2025-09-07T17:01:02.4591448Z Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB) 2025-09-07T17:01:03.0376070Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 20.4 MB/s eta 0:00:00 2025-09-07T17:01:03.3055623Z Collecting pytz>=2020.1 2025-09-07T17:01:03.3140284Z Downloading pytz-2025.2-py2.py3-none-any.whl (509 kB) 2025-09-07T17:01:03.4662241Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 509.2/509.2 KB 3.3 MB/s eta 0:00:00 2025-09-07T17:01:03.4748406Z Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /home/charlie/.local/lib/python3.10/site-packages (from dataclasses_json==0.6.7) (3.26.1) 2025-09-07T17:01:03.4756146Z Requirement already satisfied: typing-inspect<1,>=0.4.0 in /home/charlie/.local/lib/python3.10/site-packages (from dataclasses_json==0.6.7) (0.9.0) 2025-09-07T17:01:03.4842801Z Requirement already satisfied: urllib3!=2.2.0,<3,>=1.25.4 in /usr/lib/python3/dist-packages (from botocore<1.36.0,>=1.35.42->boto3==1.35.42) (1.26.5) 2025-09-07T17:01:03.4938978Z Requirement already satisfied: packaging>=17.0 in /home/charlie/.local/lib/python3.10/site-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses_json==0.6.7) (25.0) 2025-09-07T17:01:03.5050723Z Requirement already satisfied: typing-extensions>=3.7.4 in /home/charlie/.local/lib/python3.10/site-packages (from typing-inspect<1,>=0.4.0->dataclasses_json==0.6.7) (4.15.0) 2025-09-07T17:01:03.5056295Z Requirement already satisfied: mypy-extensions>=0.3.0 in /home/charlie/.local/lib/python3.10/site-packages (from 
typing-inspect<1,>=0.4.0->dataclasses_json==0.6.7) (1.1.0) 2025-09-07T17:01:03.8016131Z Installing collected packages: pytz, tzdata, python-dateutil, numpy, pandas, boto3 2025-09-07T17:01:06.2255624Z WARNING: The script f2py is installed in '/home/charlie/.local/bin' which is not on PATH. 2025-09-07T17:01:06.2256463Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-09-07T17:01:09.8405755Z Attempting uninstall: boto3 2025-09-07T17:01:09.8412939Z Found existing installation: boto3 1.35.33 2025-09-07T17:01:09.8603050Z Uninstalling boto3-1.35.33: 2025-09-07T17:01:09.8620734Z Successfully uninstalled boto3-1.35.33 2025-09-07T17:01:10.1098925Z Successfully installed boto3-1.35.42 numpy-1.26.4 pandas-2.1.3 python-dateutil-2.8.2 pytz-2025.2 tzdata-2025.2 2025-09-07T17:01:10.6959136Z Command completed after 1 attempt(s). 2025-09-07T17:01:10.7041350Z ##[group]Run python3 -m tools.stats.upload_utilization_stats.upload_utilization_stats \ 2025-09-07T17:01:10.7044669Z python3 -m tools.stats.upload_utilization_stats.upload_utilization_stats \ 2025-09-07T17:01:10.7045216Z  --workflow-run-id "17525309334" \ 2025-09-07T17:01:10.7045667Z  --workflow-name "inductor-A100-perf-nightly" \ 2025-09-07T17:01:10.7046108Z  --workflow-run-attempt "1" \ 2025-09-07T17:01:10.7046471Z  --job-id "49775768433" \ 2025-09-07T17:01:10.7047039Z  --job-name "cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100)" \ 2025-09-07T17:01:10.7047621Z  --local-path "" \ 2025-09-07T17:01:10.7047944Z  --artifact-prefix "" 2025-09-07T17:01:10.7061907Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T17:01:10.7062338Z env: 2025-09-07T17:01:10.7062586Z GIT_DEFAULT_BRANCH: main 2025-09-07T17:01:10.7062956Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T17:01:10.7063425Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5229 2025-09-07T17:01:10.7064051Z DOCKER_CONTAINER_ID: 
3a8cbe934959e038960f3d23ae7c92622b515ff945b5e1577944c1048f2da157 2025-09-07T17:01:10.7064591Z DEVICE_NAME: 2025-09-07T17:01:10.7064849Z DEVICE_TYPE: 2025-09-07T17:01:10.7065092Z ##[endgroup] 2025-09-07T17:01:15.0803092Z repo: pytorch/pytorch 2025-09-07T17:01:15.0803466Z Search for test log in s3 bucket: ossci-utilization 2025-09-07T17:01:15.0804034Z Downloading logs-test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433.zip 2025-09-07T17:01:15.0804846Z extracting usage_log.txt from zip file logs-test-inductor_torchbench_perf-6-6-linux.aws.a100_49775768433.zip 2025-09-07T17:01:15.0805494Z Converted Log Model: UtilizationMetadata: 2025-09-07T17:01:15.0807037Z UtilizationMetadata(level='metadata', workflow_id='17525309334', job_id='49775768433', workflow_name='inductor-A100-perf-nightly', job_name='cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100)', usage_collect_interval=4.0, data_model_version=1.5, start_at=1757235486, gpu_count=1, cpu_count=96, gpu_type='pynvml', error=None) 2025-09-07T17:01:15.0808645Z [Db Segments] detected pytest cmd: 4, generated segments: 4 2025-09-07T17:01:15.0809035Z [db model] Peek db timeseries 2025-09-07T17:01:15.0809312Z :{ 2025-09-07T17:01:15.0809522Z "created_at": 1757264474, 2025-09-07T17:01:15.0809804Z "type": "utilization", 2025-09-07T17:01:15.0810052Z "tags": [ 2025-09-07T17:01:15.0810272Z "record" 2025-09-07T17:01:15.0810493Z ], 2025-09-07T17:01:15.0810706Z "time_stamp": 1757235486, 2025-09-07T17:01:15.0810980Z "repo": "pytorch/pytorch", 2025-09-07T17:01:15.0811267Z "workflow_id": 17525309334, 2025-09-07T17:01:15.0811550Z "run_attempt": 1, 2025-09-07T17:01:15.0811798Z "job_id": 49775768433, 2025-09-07T17:01:15.0827616Z "workflow_name": "inductor-A100-perf-nightly", 2025-09-07T17:01:15.0828190Z "job_name": "cuda12.8-py3.10-gcc9-sm80 / test (inductor_torchbench_perf, 6, 6, linux.aws.a100)", 2025-09-07T17:01:15.0828713Z "json_data": "{}" 2025-09-07T17:01:15.0828952Z } 2025-09-07T17:01:15.0829456Z 
Writing 1 documents to S3 ossci-utilization/util_metadata/v_1.5/pytorch/pytorch/17525309334/1/49775768433/metadata 2025-09-07T17:01:15.0830413Z Done! Finish writing document to S3 ossci-utilization/util_metadata/v_1.5/pytorch/pytorch/17525309334/1/49775768433/metadata 2025-09-07T17:01:15.0831413Z Writing 1929 documents to S3 ossci-utilization/util_timeseries/v_1.5/pytorch/pytorch/17525309334/1/49775768433/time_series 2025-09-07T17:01:15.0832549Z Done! Finish writing document to S3 ossci-utilization/util_timeseries/v_1.5/pytorch/pytorch/17525309334/1/49775768433/time_series 2025-09-07T17:01:15.1826763Z Post job cleanup. 2025-09-07T17:01:15.1869644Z Post job cleanup. 2025-09-07T17:01:15.2864703Z [command]/usr/bin/git version 2025-09-07T17:01:15.2901141Z git version 2.51.0 2025-09-07T17:01:15.2944905Z Temporarily overriding HOME='/home/charlie/_work/_temp/66968f0b-6313-4ed8-baef-2fcebab9ca5b' before making global git config changes 2025-09-07T17:01:15.2945824Z Adding repository directory to the temporary git global config as a safe directory 2025-09-07T17:01:15.2949896Z [command]/usr/bin/git config --global --add safe.directory /home/charlie/_work/pytorch/pytorch 2025-09-07T17:01:15.2983829Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-09-07T17:01:15.3022068Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-09-07T17:01:15.3253253Z Entering 'android/libs/fbjni' 2025-09-07T17:01:15.3296503Z Entering 'third_party/FP16' 2025-09-07T17:01:15.3338623Z Entering 'third_party/FXdiv' 2025-09-07T17:01:15.3381056Z Entering 'third_party/NNPACK' 2025-09-07T17:01:15.3423398Z Entering 'third_party/NVTX' 2025-09-07T17:01:15.3466158Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T17:01:15.3508828Z Entering 'third_party/XNNPACK' 2025-09-07T17:01:15.3567306Z Entering 'third_party/aiter' 
2025-09-07T17:01:15.3610065Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T17:01:15.3662475Z Entering 'third_party/benchmark' 2025-09-07T17:01:15.3704311Z Entering 'third_party/composable_kernel' 2025-09-07T17:01:15.3757022Z Entering 'third_party/cpp-httplib' 2025-09-07T17:01:15.3799400Z Entering 'third_party/cpuinfo' 2025-09-07T17:01:15.3842553Z Entering 'third_party/cudnn_frontend' 2025-09-07T17:01:15.3883832Z Entering 'third_party/cutlass' 2025-09-07T17:01:15.3934878Z Entering 'third_party/fbgemm' 2025-09-07T17:01:15.3978993Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T17:01:15.4018917Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T17:01:15.4067901Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T17:01:15.4108327Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T17:01:15.4160449Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T17:01:15.4199755Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T17:01:15.4240418Z Entering 'third_party/fbgemm/external/json' 2025-09-07T17:01:15.4284477Z Entering 'third_party/flash-attention' 2025-09-07T17:01:15.4326216Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T17:01:15.4371877Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T17:01:15.4424315Z Entering 'third_party/flatbuffers' 2025-09-07T17:01:15.4468964Z Entering 'third_party/fmt' 2025-09-07T17:01:15.4511354Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T17:01:15.4553075Z Entering 'third_party/gloo' 2025-09-07T17:01:15.4595221Z Entering 'third_party/googletest' 2025-09-07T17:01:15.4637218Z Entering 'third_party/ideep' 2025-09-07T17:01:15.4678760Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T17:01:15.4727242Z Entering 'third_party/ittapi' 2025-09-07T17:01:15.4769395Z Entering 'third_party/kineto' 2025-09-07T17:01:15.4811219Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T17:01:15.4851564Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T17:01:15.4893348Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T17:01:15.4934101Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T17:01:15.4974444Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T17:01:15.5014187Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T17:01:15.5056114Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T17:01:15.5096991Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T17:01:15.5137694Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T17:01:15.5178737Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T17:01:15.5221745Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T17:01:15.5261313Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T17:01:15.5303515Z Entering 'third_party/kleidiai' 2025-09-07T17:01:15.5346667Z Entering 'third_party/mimalloc' 2025-09-07T17:01:15.5388908Z Entering 'third_party/nlohmann' 2025-09-07T17:01:15.5433405Z Entering 'third_party/onnx' 2025-09-07T17:01:15.5493759Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T17:01:15.5537267Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T17:01:15.5582063Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T17:01:15.5622505Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T17:01:15.5663362Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T17:01:15.5704442Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T17:01:15.5745706Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T17:01:15.5786256Z 
Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T17:01:15.5826980Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T17:01:15.5866528Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T17:01:15.5908452Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T17:01:15.5950812Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T17:01:15.6010548Z Entering 'third_party/pocketfft' 2025-09-07T17:01:15.6052609Z Entering 'third_party/protobuf' 2025-09-07T17:01:15.6097919Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T17:01:15.6137048Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T17:01:15.6179835Z Entering 'third_party/psimd' 2025-09-07T17:01:15.6222911Z Entering 'third_party/pthreadpool' 2025-09-07T17:01:15.6264886Z Entering 'third_party/pybind11' 2025-09-07T17:01:15.6307119Z Entering 'third_party/python-peachpy' 2025-09-07T17:01:15.6349678Z Entering 'third_party/sleef' 2025-09-07T17:01:15.6391016Z Entering 'third_party/tensorpipe' 2025-09-07T17:01:15.6434021Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T17:01:15.6473968Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T17:01:15.6514140Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T17:01:15.6554456Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T17:01:15.6593723Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T17:01:15.6653381Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-09-07T17:01:15.6675628Z http.https://github.com/.extraheader 2025-09-07T17:01:15.6684737Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-09-07T17:01:15.6714202Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 
--name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-09-07T17:01:15.6938862Z Entering 'android/libs/fbjni' 2025-09-07T17:01:15.6962542Z http.https://github.com/.extraheader 2025-09-07T17:01:15.6993715Z Entering 'third_party/FP16' 2025-09-07T17:01:15.7017351Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7049307Z Entering 'third_party/FXdiv' 2025-09-07T17:01:15.7072756Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7104245Z Entering 'third_party/NNPACK' 2025-09-07T17:01:15.7128190Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7158853Z Entering 'third_party/NVTX' 2025-09-07T17:01:15.7182669Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7214974Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T17:01:15.7238528Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7268401Z Entering 'third_party/XNNPACK' 2025-09-07T17:01:15.7293607Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7339795Z Entering 'third_party/aiter' 2025-09-07T17:01:15.7364626Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7395774Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T17:01:15.7418457Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7458348Z Entering 'third_party/benchmark' 2025-09-07T17:01:15.7482416Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7515163Z Entering 'third_party/composable_kernel' 2025-09-07T17:01:15.7539861Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7580411Z Entering 'third_party/cpp-httplib' 2025-09-07T17:01:15.7604702Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7634839Z Entering 'third_party/cpuinfo' 2025-09-07T17:01:15.7658860Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7690518Z Entering 'third_party/cudnn_frontend' 2025-09-07T17:01:15.7714463Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7745219Z 
Entering 'third_party/cutlass' 2025-09-07T17:01:15.7770171Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7812139Z Entering 'third_party/fbgemm' 2025-09-07T17:01:15.7836598Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7870129Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T17:01:15.7893310Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7923410Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T17:01:15.7946786Z http.https://github.com/.extraheader 2025-09-07T17:01:15.7983677Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T17:01:15.8007201Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8037264Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T17:01:15.8060880Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8101434Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T17:01:15.8124703Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8154513Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T17:01:15.8177156Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8207145Z Entering 'third_party/fbgemm/external/json' 2025-09-07T17:01:15.8230467Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8263778Z Entering 'third_party/flash-attention' 2025-09-07T17:01:15.8288349Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8319241Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T17:01:15.8341946Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8378894Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T17:01:15.8401348Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8441272Z Entering 'third_party/flatbuffers' 2025-09-07T17:01:15.8465665Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8500083Z Entering 'third_party/fmt' 2025-09-07T17:01:15.8524973Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8554762Z Entering 'third_party/gemmlowp/gemmlowp' 
2025-09-07T17:01:15.8578619Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8609089Z Entering 'third_party/gloo' 2025-09-07T17:01:15.8633015Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8663769Z Entering 'third_party/googletest' 2025-09-07T17:01:15.8688473Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8718988Z Entering 'third_party/ideep' 2025-09-07T17:01:15.8743005Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8772849Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T17:01:15.8795542Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8834282Z Entering 'third_party/ittapi' 2025-09-07T17:01:15.8859307Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8889607Z Entering 'third_party/kineto' 2025-09-07T17:01:15.8913957Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8943882Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T17:01:15.8967348Z http.https://github.com/.extraheader 2025-09-07T17:01:15.8997224Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T17:01:15.9020216Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9053728Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T17:01:15.9076599Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9106953Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T17:01:15.9130923Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9161404Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T17:01:15.9185617Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9214985Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T17:01:15.9238181Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9271294Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T17:01:15.9294158Z 
http.https://github.com/.extraheader 2025-09-07T17:01:15.9324738Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T17:01:15.9348257Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9378072Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T17:01:15.9401156Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9433898Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T17:01:15.9457427Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9490365Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T17:01:15.9513138Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9543176Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T17:01:15.9566737Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9598099Z Entering 'third_party/kleidiai' 2025-09-07T17:01:15.9622447Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9652849Z Entering 'third_party/mimalloc' 2025-09-07T17:01:15.9676862Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9706871Z Entering 'third_party/nlohmann' 2025-09-07T17:01:15.9731594Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9763797Z Entering 'third_party/onnx' 2025-09-07T17:01:15.9787590Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9835366Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T17:01:15.9858896Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9892603Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T17:01:15.9916892Z http.https://github.com/.extraheader 2025-09-07T17:01:15.9948530Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T17:01:15.9971516Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0002037Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T17:01:16.0025564Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0055934Z 
Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T17:01:16.0078765Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0108474Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T17:01:16.0131465Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0163180Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T17:01:16.0186465Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0216489Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T17:01:16.0239619Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0269096Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T17:01:16.0292497Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0321240Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T17:01:16.0344768Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0377103Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T17:01:16.0399811Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0431566Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T17:01:16.0454695Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0505887Z Entering 'third_party/pocketfft' 2025-09-07T17:01:16.0530042Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0561921Z Entering 'third_party/protobuf' 2025-09-07T17:01:16.0586450Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0620374Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T17:01:16.0643681Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0673480Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T17:01:16.0697208Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0729555Z Entering 'third_party/psimd' 2025-09-07T17:01:16.0757240Z http.https://github.com/.extraheader 
2025-09-07T17:01:16.0787132Z Entering 'third_party/pthreadpool' 2025-09-07T17:01:16.0811429Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0842417Z Entering 'third_party/pybind11' 2025-09-07T17:01:16.0867517Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0897991Z Entering 'third_party/python-peachpy' 2025-09-07T17:01:16.0922309Z http.https://github.com/.extraheader 2025-09-07T17:01:16.0952746Z Entering 'third_party/sleef' 2025-09-07T17:01:16.0976427Z http.https://github.com/.extraheader 2025-09-07T17:01:16.1008205Z Entering 'third_party/tensorpipe' 2025-09-07T17:01:16.1032133Z http.https://github.com/.extraheader 2025-09-07T17:01:16.1063196Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T17:01:16.1086909Z http.https://github.com/.extraheader 2025-09-07T17:01:16.1119473Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T17:01:16.1139638Z http.https://github.com/.extraheader 2025-09-07T17:01:16.1170326Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T17:01:16.1192737Z http.https://github.com/.extraheader 2025-09-07T17:01:16.1223395Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T17:01:16.1246831Z http.https://github.com/.extraheader 2025-09-07T17:01:16.1275304Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T17:01:16.1298094Z http.https://github.com/.extraheader 2025-09-07T17:01:16.1457898Z Post job cleanup. 
2025-09-07T17:01:16.2409486Z [command]/usr/bin/git version 2025-09-07T17:01:16.2443198Z git version 2.51.0 2025-09-07T17:01:16.2482435Z Temporarily overriding HOME='/home/charlie/_work/_temp/7e8e547b-cc9d-4414-a3ec-6ba49d00fa92' before making global git config changes 2025-09-07T17:01:16.2483450Z Adding repository directory to the temporary git global config as a safe directory 2025-09-07T17:01:16.2487456Z [command]/usr/bin/git config --global --add safe.directory /home/charlie/_work/pytorch/pytorch 2025-09-07T17:01:16.2517491Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-09-07T17:01:16.2554150Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-09-07T17:01:16.2782582Z Entering 'android/libs/fbjni' 2025-09-07T17:01:16.2825743Z Entering 'third_party/FP16' 2025-09-07T17:01:16.2868440Z Entering 'third_party/FXdiv' 2025-09-07T17:01:16.2911160Z Entering 'third_party/NNPACK' 2025-09-07T17:01:16.2953093Z Entering 'third_party/NVTX' 2025-09-07T17:01:16.2995832Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T17:01:16.3038315Z Entering 'third_party/XNNPACK' 2025-09-07T17:01:16.3097037Z Entering 'third_party/aiter' 2025-09-07T17:01:16.3140481Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T17:01:16.3190824Z Entering 'third_party/benchmark' 2025-09-07T17:01:16.3232886Z Entering 'third_party/composable_kernel' 2025-09-07T17:01:16.3285625Z Entering 'third_party/cpp-httplib' 2025-09-07T17:01:16.3327594Z Entering 'third_party/cpuinfo' 2025-09-07T17:01:16.3369752Z Entering 'third_party/cudnn_frontend' 2025-09-07T17:01:16.3412033Z Entering 'third_party/cutlass' 2025-09-07T17:01:16.3462550Z Entering 'third_party/fbgemm' 2025-09-07T17:01:16.3507936Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T17:01:16.3548234Z Entering 'third_party/fbgemm/external/composable_kernel' 
2025-09-07T17:01:16.3596617Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T17:01:16.3637932Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T17:01:16.3686135Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T17:01:16.3725609Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T17:01:16.3765459Z Entering 'third_party/fbgemm/external/json' 2025-09-07T17:01:16.3808510Z Entering 'third_party/flash-attention' 2025-09-07T17:01:16.3851280Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T17:01:16.3899386Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T17:01:16.3949822Z Entering 'third_party/flatbuffers' 2025-09-07T17:01:16.3994284Z Entering 'third_party/fmt' 2025-09-07T17:01:16.4036448Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T17:01:16.4077719Z Entering 'third_party/gloo' 2025-09-07T17:01:16.4119401Z Entering 'third_party/googletest' 2025-09-07T17:01:16.4161107Z Entering 'third_party/ideep' 2025-09-07T17:01:16.4201784Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T17:01:16.4248965Z Entering 'third_party/ittapi' 2025-09-07T17:01:16.4290907Z Entering 'third_party/kineto' 2025-09-07T17:01:16.4331806Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T17:01:16.4371344Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T17:01:16.4413567Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T17:01:16.4453176Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T17:01:16.4494048Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T17:01:16.4534247Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T17:01:16.4575375Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T17:01:16.4615979Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T17:01:16.4655619Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T17:01:16.4696738Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T17:01:16.4738608Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T17:01:16.4778481Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T17:01:16.4820603Z Entering 'third_party/kleidiai' 2025-09-07T17:01:16.4862824Z Entering 'third_party/mimalloc' 2025-09-07T17:01:16.4905486Z Entering 'third_party/nlohmann' 2025-09-07T17:01:16.4949006Z Entering 'third_party/onnx' 2025-09-07T17:01:16.5010303Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T17:01:16.5053504Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T17:01:16.5096226Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T17:01:16.5135572Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T17:01:16.5175098Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T17:01:16.5214576Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T17:01:16.5255000Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T17:01:16.5294341Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T17:01:16.5334301Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T17:01:16.5373037Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T17:01:16.5413913Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T17:01:16.5455806Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T17:01:16.5517164Z Entering 'third_party/pocketfft' 2025-09-07T17:01:16.5558379Z Entering 'third_party/protobuf' 2025-09-07T17:01:16.5603169Z Entering 
'third_party/protobuf/third_party/benchmark' 2025-09-07T17:01:16.5642294Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T17:01:16.5684595Z Entering 'third_party/psimd' 2025-09-07T17:01:16.5725952Z Entering 'third_party/pthreadpool' 2025-09-07T17:01:16.5768204Z Entering 'third_party/pybind11' 2025-09-07T17:01:16.5809881Z Entering 'third_party/python-peachpy' 2025-09-07T17:01:16.5852576Z Entering 'third_party/sleef' 2025-09-07T17:01:16.5893755Z Entering 'third_party/tensorpipe' 2025-09-07T17:01:16.5934548Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T17:01:16.5973632Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T17:01:16.6012552Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T17:01:16.6052289Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T17:01:16.6091631Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T17:01:16.6150208Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-09-07T17:01:16.6180182Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-09-07T17:01:16.6400216Z Entering 'android/libs/fbjni' 2025-09-07T17:01:16.6440863Z Entering 'third_party/FP16' 2025-09-07T17:01:16.6482249Z Entering 'third_party/FXdiv' 2025-09-07T17:01:16.6524046Z Entering 'third_party/NNPACK' 2025-09-07T17:01:16.6565702Z Entering 'third_party/NVTX' 2025-09-07T17:01:16.6607227Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T17:01:16.6648793Z Entering 'third_party/XNNPACK' 2025-09-07T17:01:16.6705122Z Entering 'third_party/aiter' 2025-09-07T17:01:16.6747343Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T17:01:16.6795832Z Entering 'third_party/benchmark' 2025-09-07T17:01:16.6837578Z Entering 
'third_party/composable_kernel' 2025-09-07T17:01:16.6887210Z Entering 'third_party/cpp-httplib' 2025-09-07T17:01:16.6928646Z Entering 'third_party/cpuinfo' 2025-09-07T17:01:16.6970792Z Entering 'third_party/cudnn_frontend' 2025-09-07T17:01:16.7011771Z Entering 'third_party/cutlass' 2025-09-07T17:01:16.7061790Z Entering 'third_party/fbgemm' 2025-09-07T17:01:16.7105300Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T17:01:16.7145371Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T17:01:16.7191199Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T17:01:16.7230955Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T17:01:16.7280696Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T17:01:16.7320039Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T17:01:16.7359816Z Entering 'third_party/fbgemm/external/json' 2025-09-07T17:01:16.7402471Z Entering 'third_party/flash-attention' 2025-09-07T17:01:16.7444896Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T17:01:16.7490441Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T17:01:16.7539583Z Entering 'third_party/flatbuffers' 2025-09-07T17:01:16.7583353Z Entering 'third_party/fmt' 2025-09-07T17:01:16.7625264Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T17:01:16.7667720Z Entering 'third_party/gloo' 2025-09-07T17:01:16.7709261Z Entering 'third_party/googletest' 2025-09-07T17:01:16.7750884Z Entering 'third_party/ideep' 2025-09-07T17:01:16.7791537Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T17:01:16.7837863Z Entering 'third_party/ittapi' 2025-09-07T17:01:16.7878966Z Entering 'third_party/kineto' 2025-09-07T17:01:16.7919950Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T17:01:16.7959588Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T17:01:16.8001481Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 
2025-09-07T17:01:16.8041371Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T17:01:16.8080616Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T17:01:16.8119198Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T17:01:16.8160492Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T17:01:16.8200334Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T17:01:16.8240647Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T17:01:16.8281613Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T17:01:16.8324130Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T17:01:16.8363291Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T17:01:16.8404663Z Entering 'third_party/kleidiai' 2025-09-07T17:01:16.8447963Z Entering 'third_party/mimalloc' 2025-09-07T17:01:16.8489528Z Entering 'third_party/nlohmann' 2025-09-07T17:01:16.8532519Z Entering 'third_party/onnx' 2025-09-07T17:01:16.8592685Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T17:01:16.8635326Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T17:01:16.8678665Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T17:01:16.8718181Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T17:01:16.8757416Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T17:01:16.8796051Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T17:01:16.8836517Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T17:01:16.8875545Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T17:01:16.8914509Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 
2025-09-07T17:01:16.8953206Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T17:01:16.8994620Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T17:01:16.9036091Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T17:01:16.9097917Z Entering 'third_party/pocketfft' 2025-09-07T17:01:16.9138862Z Entering 'third_party/protobuf' 2025-09-07T17:01:16.9182787Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T17:01:16.9222471Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T17:01:16.9264355Z Entering 'third_party/psimd' 2025-09-07T17:01:16.9305710Z Entering 'third_party/pthreadpool' 2025-09-07T17:01:16.9347266Z Entering 'third_party/pybind11' 2025-09-07T17:01:16.9388284Z Entering 'third_party/python-peachpy' 2025-09-07T17:01:16.9429677Z Entering 'third_party/sleef' 2025-09-07T17:01:16.9471772Z Entering 'third_party/tensorpipe' 2025-09-07T17:01:16.9512348Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T17:01:16.9551342Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T17:01:16.9591030Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T17:01:16.9630676Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T17:01:16.9668965Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T17:01:16.9849388Z Cleaning up orphan processes